CROSS-REFERENCES TO RELATED APPLICATIONS 
The present application is related to 60/128,606, filed April 8, 1999 and 
60/108,279, filed November 12, 1998, which are incorporated herein by reference. 



STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 



FIELD OF THE INVENTION 
This invention relates to nucleic acids and polypeptides from Chlamydia 
pneumoniae and to their use in the diagnosis, prevention and treatment of diseases 
associated with C. pneumoniae. 

BACKGROUND OF THE INVENTION 

Chlamydiaceae is a family of obligate intracellular parasites with a tropism 
for epithelial cells lining the mucous membranes. The bacteria have two morphologically 
distinct forms, "elementary body" and "reticulate body". The elementary body is the 
infectious form, and has a rigid cell wall, primarily of cross-linked outer membrane 
proteins. The reticulate body is the intracellular, metabolically active form. A unique 
developmental cycle between these two forms characterizes Chlamydia growth. 

C. pneumoniae is a human respiratory pathogen that causes acute 
respiratory disease, and approximately 10% of community-acquired pneumonia. 
Antibody prevalence studies have shown that virtually everyone is infected with C. 
pneumoniae at some time, and that reinfection is common. In addition to respiratory 
disease, studies have shown an association of this organism with coronary artery disease. 
It has been demonstrated in atherosclerotic lesions of the aorta and coronary arteries by 
immunocytochemistry and by polymerase chain reaction (Kuo et al. (1993) J Infect Dis 
167(4):841-849). 

Recent reports have further demonstrated the presence of C. pneumoniae 
in the walls of abdominal aortic aneurysms (Juvonen et al. (1997) J Vasc Surg 
25(3):499-505). Abdominal aortic aneurysms are frequently associated with 
atherosclerosis, and inflammation may be an important factor in aneurysmal dilatation. 
C. pneumoniae may play a role in maintaining inflammation and triggering the 
development of aortic aneurysms. 

Muhlestein et al. (1996) JACC 27:1555-61, reported a differential 
incidence of Chlamydia species within the coronary artery wall of patients with 
atherosclerosis versus those with other forms of cardiovascular disease. The extremely 
high rate of possible infection in patients with symptomatic atherosclerotic disease 
compared to the very low rate in patients with normal coronary arteries or coronary artery 
disease from chronic transplant rejection provides evidence for a direct link between the 
atherosclerotic process and Chlamydia infection. Because a history of chlamydial 
infection is so prevalent in the population, the issue of causality remains. On a 
physiologic and pathologic level, abnormal interactions among endothelial cells, platelets, 
macrophages and lymphocytes may lead to a cascade of events resulting in acute 
endothelial damage, thrombosis and repair, chronically leading to the development of 
atheroma in blood vessels. 

C. pneumoniae is related to other Chlamydia species, but the level of 
sequence similarity is relatively low. Very little is known about the biology of this 
organism, although it appears to be an important human pathogen. Allelic diversity and 
structural relationships between specific genes of Chlamydial species are described in 
Kaltenboeck et al. (1993) J Bacteriol 175(2):487-502; Gaydos et al. (1992) Infect Immun 
60(12):5319-5323; Everett et al. (1997) Int J Syst Bacteriol 47(2):461-473; and 
Pudjiatmoko et al. (1997) Int J Syst Bacteriol 47(2):425-431. 

A number of studies have been published describing methods for detection 
of C. pneumoniae, and for distinguishing between Chlamydial species. Such methods 
include PCR detection (Rasmussen et al. (1992) Mol Cell Probes 6(5):389-394; Holland 
et al. (1990) J Infect Dis 162(4):984-987); a simplified polymerase chain reaction-enzyme 
immunoassay (Wilson et al. (1996) J Appl Bacteriol 80(4):431-438); and sequence 
determination and restriction endonuclease cleavage (Herrmann et al. (1996) J Clin 
Microbiol 34(8):1897-1902). 

Antigenic and molecular analyses of different C. pneumoniae strains are 
described in Jantos et al. (1997) J Clin Microbiol 35(3):620-623. Some genes of C. 
pneumoniae have been isolated and sequenced. These include the GroE operon (Kikuta 
et al. (1991) Infect Immun 59(12):4665-4669); the major outer membrane protein (Perez et 
al. (1991) Infect Immun 59(6):2195-2199); the DnaK protein homolog (Kornak et al. 
(1991) Infect Immun 59(2):721-725); as well as a number of ribosomal and other genes. 



SUMMARY OF THE INVENTION 
This invention provides the genomic sequence of Chlamydia pneumoniae. 
The sequence information is useful for a variety of diagnostic and analytical methods. 
The genomic sequence may be embodied in a variety of media, including computer 
readable forms, or as a nucleic acid comprising a selected fragment of the sequence. 
Such fragments generally consist of an open reading frame, transcriptional or translational 
control elements, or fragments derived therefrom. Proteins encoded by the open reading 
frames are useful for diagnostic purposes, as well as for their enzymatic or structural 
activity. 

DEFINITIONS 

The term "amino acid" refers to naturally occurring and synthetic amino 
acids, as well as amino acid analogs and amino acid mimetics that function in a manner 
similar to the naturally occurring amino acids. Naturally occurring amino acids are those 
encoded by the genetic code, as well as those amino acids that are later modified, e.g., 
hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to 
compounds that have the same basic chemical structure as a naturally occurring amino 
acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and 
an R group, e.g., homoserine, norleucine, methionine sulfoxide, and methionine methyl 
sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide 
backbones, but retain the same basic chemical structure as a naturally occurring amino 
acid. Amino acid mimetics refers to chemical compounds that have a structure that is 
different from the general chemical structure of an amino acid, but that function in a 
manner similar to a naturally occurring amino acid. 

Amino acids may be referred to herein by either their commonly known 
three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB 
Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by 
their commonly accepted single-letter codes. 

"Amplification" primers are oligonucleotides comprising either natural or 
analogue nucleotides that can serve as the basis for the amplification of a select nucleic 
acid sequence. They include, e.g., polymerase chain reaction primers and ligase chain 
reaction oligonucleotides. 

"Antibody" refers to an immunoglobulin molecule able to bind to a 
specific epitope on an antigen. Antibodies can be a polyclonal mixture or monoclonal. 
Antibodies can be intact immunoglobulins derived from natural sources or from 
recombinant sources and can be immunoreactive portions of intact immunoglobulins. 
Antibodies may exist in a variety of forms including, for example, Fv, Fab, and F(ab)2, as 
well as in single chains. Single-chain antibodies, in which genes for a heavy chain and a 
light chain are combined into a single coding sequence, may also be used. 

An "antigen" is a molecule that is recognized and bound by an antibody, 
e.g., peptides, carbohydrates, organic molecules, or more complex molecules such as 
glycolipids and glycoproteins. The part of the antigen that is the target of antibody 
binding is an antigenic determinant, and a small functional group that corresponds to a 
single antigenic determinant is called a hapten. 

"Biological sample" refers to any sample obtained from a living or dead 
organism. Examples of biological samples include biological fluids and tissue specimens. 
Such biological samples can be prepared for analysis of the presence of C. pneumoniae 
nucleic acids, proteins, or antibodies specifically reactive with the proteins. 

The term "C. pneumoniae gene" shall be intended to mean the open 
reading frame encoding specific C. pneumoniae polypeptides, as well as adjacent 5' and 
3' non-coding nucleotide sequences involved in the regulation of expression, up to about 
2 kb beyond the coding region, but possibly further in either direction. The gene may be 
introduced into an appropriate vector for extrachromosomal maintenance or for 
integration into a host genome. 

"Conservatively modified variants" applies to both amino acid and nucleic 
acid sequences. With respect to particular nucleic acid sequences, conservatively 
modified variants refers to those nucleic acids which encode identical or essentially 
identical amino acid sequences, or where the nucleic acid does not encode an amino acid 
sequence, to essentially identical sequences. Specifically, degenerate codon substitutions 
may be achieved by generating sequences in which the third position of one or more 
selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues 
(Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 
260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids 
encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all 
encode the amino acid alanine. Thus, at every position where an alanine is specified by a 
codon, the codon can be altered to any of the corresponding codons described without 
altering the encoded polypeptide. Such nucleic acid variations are "silent variations," 
which are one species of conservatively modified variations. Every nucleic acid sequence 
herein which encodes a polypeptide also describes every possible silent variation of the 
nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, 
which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only 
codon for tryptophan) can be modified to yield a functionally identical molecule. 
Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is 
implicit in each described sequence. 
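
The notion of a silent variation can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: it uses the standard genetic code written in the DNA alphabet (T in place of U), and the helper names `translate`, `synonymous_codons` and `count_silent_variants` are hypothetical, not part of any disclosed method.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """Return every codon that encodes the same amino acid as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_silent_variants(dna):
    """Count the distinct sequences (including the input) encoding the same product."""
    usable = len(dna) - len(dna) % 3
    total = 1
    for i in range(0, usable, 3):
        total *= len(synonymous_codons(dna[i:i + 3]))
    return total
```

For the nine-nucleotide sequence ATGGCATGG (Met-Ala-Trp), only the alanine codon can vary, so the sketch counts four sequences in total: the original and three silent variants.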

As to amino acid sequences, one of skill will recognize that individual 
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein 
sequence which alters, adds or deletes a single amino acid or a small percentage of amino 
acids in the encoded sequence is a "conservatively modified variant" where the alteration 
results in the substitution of an amino acid with a chemically similar amino acid. 
Conservative substitution tables providing functionally similar amino acids are well 
known in the art. Such conservatively modified variants are in addition to and do not 
exclude polymorphic variants, interspecies homologs, and alleles of the invention. 

The following groups each contain amino acids that are conservative 
substitutions for one another: 

1) Alanine (A), Glycine (G); 
2) Serine (S), Threonine (T); 
3) Aspartic acid (D), Glutamic acid (E); 
4) Asparagine (N), Glutamine (Q); 
5) Cysteine (C), Methionine (M); 
6) Arginine (R), Lysine (K), Histidine (H); 
7) Isoleucine (I), Leucine (L), Valine (V); and 
8) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). 
(see, e.g., Creighton, Proteins (1984)). 



The terms "identical" or percent "identity," in the context of two or more 
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences 
that are the same or have a specified percentage of amino acid residues or nucleotides that 
are the same, when compared and aligned for maximum correspondence over a 
comparison window, as measured using one of the following sequence comparison 
algorithms or by manual alignment and visual inspection. This definition also refers to 
the complement of a test sequence, which has a designated percent sequence or 
subsequence complementarity when the test sequence has a designated or substantial 
identity to a reference sequence. For example, a designated amino acid percent identity 
of 95% refers to sequences or subsequences that have at least about 95% amino acid 
identity when aligned for maximum correspondence over a comparison window as 
measured using one of the following sequence comparison algorithms or by manual 
alignment and visual inspection. Such sequences would then be said to have substantial 
identity, or to be substantially identical to each other. Preferably, sequences have at least 
about 70% identity, more preferably 80% identity, more preferably 90-95% identity and 
above. Preferably, the percent identity exists over a region of the sequence that is at least 
about 25 amino acids in length, more preferably over a region that is 50-100 amino acids 
in length. 

When percentage of sequence identity is used in reference to proteins or 
peptides, it is recognized that residue positions that are not identical often differ by 
conservative amino acid substitutions, where amino acid residues are substituted for 
other amino acid residues with similar chemical properties (e.g., charge or 
hydrophobicity) and therefore do not change the functional properties of the molecule. 
Where sequences differ in conservative substitutions, the percent sequence identity may 
be adjusted upwards to correct for the conservative nature of the substitution. Means for 
making this adjustment are well known to those of skill in the art. Typically this involves 
scoring a conservative substitution as a partial rather than a full mismatch, thereby 
increasing the percentage sequence identity. Thus, for example, where an identical amino 
acid is given a score of 1 and a non-conservative substitution is given a score of zero, a 
conservative substitution is given a score between zero and 1. The scoring of 
conservative substitutions is calculated according to, e.g., the algorithm of Meyers & 
Miller, Computer Applic. Biol. Sci. 4:11-17 (1988), e.g., as implemented in the program 
PC/GENE (Intelligenetics, Mountain View, California, USA). 
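
The upward adjustment described above can be sketched as follows. This is a minimal illustration, not the Meyers & Miller algorithm: it assumes two pre-aligned, gap-free sequences of equal length, uses the conservative substitution groups listed in this section, and picks an arbitrary partial score of 0.5 for a conservative substitution. The function names are hypothetical.

```python
# Conservative substitution groups as listed in this section (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("AG"), set("ST"), set("DE"), set("NQ"),
    set("CM"), set("RKH"), set("ILV"), set("FYW"),
]


def is_conservative(a, b):
    """True when residues a and b differ but belong to the same group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def adjusted_identity(seq_a, seq_b, conservative_score=0.5):
    """Percent identity with conservative substitutions scored as partial matches.

    Identical residues score 1, non-conservative substitutions score 0, and
    conservative substitutions score `conservative_score` (assumed value, 0..1).
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be pre-aligned, equal-length and non-empty")
    total = 0.0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            total += 1.0
        elif is_conservative(a, b):
            total += conservative_score
    return 100.0 * total / len(seq_a)
```

With these assumptions, AKIV compared against AKLV scores 87.5% rather than the unadjusted 75%, because the isoleucine-to-leucine change counts as half a match.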



For sequence comparison, typically one sequence acts as a reference 
sequence, to which test sequences are compared. When using a sequence comparison 
algorithm, test and reference sequences are entered into a computer, subsequence 
coordinates are designated, if necessary, and sequence algorithm program parameters are 
designated. Default program parameters can be used, or alternative parameters can be 
designated. The sequence comparison algorithm then calculates the percent sequence 
identity for the test sequence(s) relative to the reference sequence, based on the 
designated or default program parameters. 

A comparison window includes reference to a segment of any one of the 
number of contiguous positions selected from the group consisting of from 25 to 600, 
usually about 50 to about 200, more usually about 100 to about 150, in which a sequence 
may be compared to a reference sequence of the same number of contiguous positions 
after the two sequences are optimally aligned. Methods of alignment of sequences for 
comparison are well-known in the art. Optimal alignment of sequences for comparison 
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. 
Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & 
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & 
Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations 
of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics 
Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by 
manual alignment and visual inspection (see, e.g., Ausubel et al., supra). 
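
As an illustration of the local homology approach, the sketch below computes a Smith-Waterman local alignment score by dynamic programming. It is a simplified teaching example, not the GCG implementation: the match, mismatch and linear gap values are assumptions chosen for readability, and only the best score is returned, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (linear gap penalty).

    Scoring values are illustrative assumptions, not defaults of any program.
    """
    cols = len(b) + 1
    prev = [0] * cols          # scores for the previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because the recurrence is floored at zero, flanking regions that do not match (for example TT...TT versus GG...GG around a shared ACGT core) do not reduce the score of the shared segment.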

One example of a useful algorithm is PILEUP. PILEUP creates a multiple 
sequence alignment from a group of related sequences using progressive, pairwise 
alignments to show relationship and percent sequence identity. It also plots a tree or 
dendrogram showing the clustering relationships used to create the alignment. PILEUP 
uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. 
Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins 
& Sharp, CABIOS 5:151-153 (1989). The program can align up to 300 sequences, each 
of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment 
procedure begins with the pairwise alignment of the two most similar sequences, 
producing a cluster of two aligned sequences. This cluster is then aligned to the next 
most related sequence or cluster of aligned sequences. Two clusters of sequences are 
aligned by a simple extension of the pairwise alignment of two individual sequences. The 
final alignment is achieved by a series of progressive, pairwise alignments. The program 
is run by designating specific sequences and their amino acid or nucleotide coordinates 
for regions of sequence comparison and by designating the program parameters. Using 
PILEUP, a reference sequence is compared to other test sequences to determine the 
percent sequence identity relationship using the following parameters: default gap weight 
(3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained 
from the GCG sequence analysis software package, e.g., version 7.0 (Devereaux et al., 
Nuc. Acids Res. 12:387-395 (1984)). 

Another example of an algorithm that is suitable for determining percent 
sequence identity (i.e., substantial similarity or identity) is the BLAST algorithm, which 
is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing 
BLAST analyses is publicly available through the National Center for Biotechnology 
Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying 
high scoring sequence pairs (HSPs) by identifying short words of length W in the query 
sequence, which either match or satisfy some positive-valued threshold score T when 
aligned with a word of the same length in a database sequence. T is referred to as the 
neighborhood word score threshold (Altschul et al., supra). These initial neighborhood 
word hits act as seeds for initiating searches to find longer HSPs containing them. The 
word hits are then extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, for 
nucleotide sequences, the parameters M (reward score for a pair of matching residues; 
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid 
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the 
word hits in each direction is halted when: the cumulative alignment score falls off by 
the quantity X from its maximum achieved value; the cumulative score goes to zero or 
below, due to the accumulation of one or more negative-scoring residue alignments; or 
the end of either sequence is reached. The BLAST algorithm parameters W, T, and X 
determine the sensitivity and speed of the alignment. The BLASTN program (for 
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, 
M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP 
program uses as default parameters a wordlength (W) of 3, an expectation (E) of 10, and 
the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 
89:10915 (1989)). 
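
The seed-and-extend idea can be sketched in a few lines. The example below is a deliberately reduced illustration and not the BLAST implementation: seeds are exact word matches only (the neighborhood threshold T is not modelled), extension is ungapped and shown in the rightward direction only, and the drop-off value X=10 is an assumed figure. M=5 and N=-4 follow the nucleotide scoring values given above. The function names are hypothetical.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where words of length w match exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def extend_right(query, subject, qi, si, m=5, n=-4, x=10):
    """Ungapped rightward extension from (qi, si); returns (best_score, best_length).

    Extension stops when the score drops by x from its maximum, when the
    cumulative score reaches zero or below, or when either sequence ends.
    """
    score = best = 0
    best_len = 0
    k = 0
    while qi + k < len(query) and si + k < len(subject):
        score += m if query[qi + k] == subject[si + k] else n
        k += 1
        if score > best:
            best, best_len = score, k
        elif best - score >= x or score <= 0:
            break
    return best, best_len
```

A full search would extend each seed in both directions and keep the highest-scoring segment pair; a larger X explores further past mismatches at the cost of speed, which is the sensitivity and speed trade-off noted above.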

The BLAST algorithm also performs a statistical analysis of the similarity 
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl Acad. Sci. USA 
90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is 
the smallest sum probability (P(N)), which provides an indication of the probability by 
which a match between two nucleotide or amino acid sequences would occur by chance. 
For example, a nucleic acid is considered similar to a reference sequence if the smallest 
sum probability in a comparison of the test nucleic acid to the reference nucleic acid is 
less than about 0.1, more preferably less than about 0.01, and most preferably less than 
about 0.001. 

An indication that two nucleic acid sequences or polypeptides are 
substantially identical is that the polypeptide encoded by the first nucleic acid is 
immunologically cross reactive with the antibodies raised against the polypeptide 
encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically 
substantially identical to a second polypeptide, for example, where the two peptides differ 
only by conservative substitutions. Another indication that two nucleic acid sequences 
are substantially identical is that the two molecules or their complements hybridize to 
each other under stringent conditions, as described below. 

Another indication that polynucleotide sequences are substantially 
identical is if two molecules hybridize to each other under stringent conditions. Stringent 
conditions are sequence dependent and will be different in different circumstances. 
Generally, stringent conditions are selected to be about 5°C lower than the thermal 
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm 
is the temperature (under defined ionic strength and pH) at which 50% of the target 
sequence hybridizes to a perfectly matched probe. Typically, stringent conditions for a 
Southern blot protocol involve hybridizing in a buffer comprising 5x SSC, 1% SDS at 
65°C or hybridizing in a buffer containing 5x SSC and 1% SDS at 42°C and washing at 
65°C with a 0.2x SSC, 0.1% SDS wash. 
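
The relationship between Tm and a stringent temperature can be illustrated numerically. The sketch below is not taken from this specification: it assumes the Wallace rule of thumb (2°C per A or T, 4°C per G or C), which is only a rough approximation for short oligonucleotide probes and ignores ionic strength and pH. The function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C for a short DNA oligonucleotide (Wallace rule).

    Assumption: Tm = 2 * (A + T) + 4 * (G + C); not suitable for long probes.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about `offset` degrees C below Tm, per the definition above."""
    return tm_celsius - offset
```

For a 16-mer with eight A/T and eight G/C positions, this rule of thumb gives a Tm of 48°C, so a wash roughly 5°C lower would be at about 43°C.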

A "label" is a composition detectable by spectroscopic, photochemical, 
biochemical, immunochemical, or chemical means. For example, useful labels include 
32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an 
ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal 
antibodies are available. 






The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides 
and polymers thereof in either single- or double-stranded form. The term encompasses 
nucleic acids containing known nucleotide analogs or modified backbone residues or 
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which 
have similar binding properties as the reference nucleic acid, and which are metabolized 
in a manner similar to the reference nucleotides. Examples of such analogs include, 
without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, 
chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs). 

Unless otherwise indicated, a particular nucleic acid sequence also 
implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon 
substitutions) and complementary sequences, as well as the sequence explicitly indicated. 
The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, 
and polynucleotide. 

As used herein a "nucleic acid probe or oligonucleotide" is defined as a 
nucleic acid capable of binding to a target nucleic acid of complementary sequence 
through one or more types of chemical bonds, usually through complementary base 
pairing, usually through hydrogen bond formation. As used herein, a probe may include 
natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In 
addition, the bases in a probe may be joined by a linkage other than a phosphodiester 
bond, so long as it does not interfere with hybridization. Thus, for example, probes may 
be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather 
than phosphodiester linkages. It will be understood by one of skill in the art that probes 
may bind target sequences lacking complete complementarity with the probe sequence 
depending upon the stringency of the hybridization conditions. The probes are preferably 
directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly 
labeled such as with biotin to which a streptavidin complex may later bind. By assaying 
for the presence or absence of the probe, one can detect the presence or absence of the 
select sequence or subsequence. 

A labeled nucleic acid probe or oligonucleotide is one that is bound, either 
covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label 
such that the presence of the probe may be detected by detecting the presence of the label 
bound to the probe. 




"Pharmaceutically acceptable" means a material that is not biologically or 
otherwise undesirable, i.e., the material can be administered to an individual along with a 
Chlamydia antigen without causing any undesirable biological effects or interacting in a 
deleterious manner with any of the other components of the pharmaceutical composition. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably 
herein to refer to a polymer of amino acid residues. The terms apply to amino acid 
polymers in which one or more amino acid residue is an analog or mimetic of a 
corresponding naturally occurring amino acid, as well as to naturally occurring amino 
acid polymers. 

The phrase "specifically or selectively hybridizing to," refers to 
hybridization between a probe and a target sequence in which the probe binds 
substantially only to the target sequence, forming a hybridization complex, when the 
target is in a heterogeneous mixture of polynucleotides and other compounds. Such 
hybridization is determinative of the presence of the target sequence. Although the probe 
may bind other unrelated sequences, at least 90%, preferably 95% or more of the 
hybridization complexes formed are with the target sequence. 

The term "recombinant" when used with reference to a cell, or nucleic 
acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the 
introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or 
that the cell is derived from a cell so modified. Thus, for example, recombinant cells 
express genes that are not found within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all. 

The phrase "specifically immunoreactive with", when referring to a protein 
or peptide, refers to a binding reaction between the protein and an antibody which is 
determinative of the presence of the protein in the presence of a heterogeneous population 
of proteins and other compounds. Thus, under designated immunoassay conditions, the 
specified antibodies bind to a particular protein and do not bind in a significant amount to 
other proteins present in the sample. Specific binding to an antibody under such 
conditions may require an antibody that is selected for its specificity for a particular 
protein. A variety of immunoassay formats may be used to select antibodies specifically 
immunoreactive with a particular protein and are described in detail below. 

The phrase "substantially pure" or "isolated" when referring to a 
Chlamydia peptide or protein, means a chemical composition which is free of other 
subcellular components of the Chlamydia organism. Typically, a monomeric protein is 
substantially pure when at least about 85% or more of a sample exhibits a single 
polypeptide backbone. Minor variants or chemical modifications may typically share the 
same polypeptide sequence. Depending on the purification procedure, purities of 85%, 
and preferably over 95% pure are possible. Protein purity or homogeneity may be 
indicated by a number of means well known in the art, such as polyacrylamide gel 
electrophoresis of a protein sample, followed by visualizing a single polypeptide band on 
a polyacrylamide gel upon silver staining. For certain purposes high resolution will be 
needed and HPLC or a similar means for purification utilized. 

DETAILED DESCRIPTION 
The present invention provides the nucleotide sequence of the C. 
pneumoniae genome SEQ ID NO: 1 or a representative fragment thereof, in a form which 
can be readily used, analyzed, and interpreted by a skilled artisan. As used herein, a 
"representative fragment" of the nucleotide sequence depicted in SEQ ID NO: 1 refers to 
any portion which is not presently represented within a publicly available database. 
Preferred representative fragments of the present invention are open reading frames, 
expression modulating fragments, uptake/modulating fragments, and fragments which can 
be used to diagnose the presence of C. pneumoniae in a sample. Using the information 
provided in the present application, together with routine cloning and sequencing 
methods, one of ordinary skill in the art will be able to clone and sequence all 
"representative fragments" of interest including open reading frames (ORFs) encoding a 
large variety of C. pneumoniae proteins. A non-limiting identification of such preferred 
representative fragments is provided in Tables 2 and [illegible]. 



Diagnostic use of C. pneumoniae nucleic acids 



Hybridization-based assays 

Using the nucleic acids disclosed here, one of skill can design nucleic acid 
hybridization-based assays for the detection of C. pneumoniae. Any of a number of well 
known techniques for the specific detection of target nucleic acids can be used. 
Exemplary hybridization-based assays include, but are not limited to, traditional "direct 
probe" methods such as Southern Blots, dot blots, in situ hybridization (e.g., FISH), PCR, 
and the like. The methods can be used in a wide variety of formats including, but not 
limited to, substrate- (e.g. membrane or glass) bound methods or array-based approaches 
as described below. As noted above, this invention also embraces methods for detecting 
the presence of Chlamydia DNA or RNA in biological samples. These sequences can be 
used to detect Chlamydia in biological samples from patients suspected of being infected. 
A variety of methods of specific DNA and RNA measurement using nucleic acid 
hybridization techniques are known to those of skill in the art (see Sambrook et al., 
supra). 

In situ hybridization assays are well known (e.g., Angerer (1987) Meth. 
Enzymol 152: 649). Generally, in situ hybridization comprises the following major steps: 
(1) fixation of tissue or biological structure to be analyzed; (2) prehybridization treatment of 
the biological structure to increase accessibility of target DNA, and to reduce nonspecific 
binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the 
biological structure or tissue; (4) post-hybridization washes to remove nucleic acid 
fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid 
fragments. The reagent used in each of these steps and the conditions for use vary 
depending on the particular application. 

In a typical in situ hybridization assay, cells are fixed to a solid support, 
typically a glass slide. If a nucleic acid is to be probed, the cells are typically denatured 
with heat or alkali. The cells are then contacted with a hybridization solution at a 
moderate temperature to permit annealing of labeled probes specific to the nucleic acid 
sequence encoding the protein. The targets (e.g., cells) are then typically washed at a 
predetermined stringency or at an increasing stringency until an appropriate signal to 
noise ratio is obtained. 

The nucleic acids of this invention are particularly well suited to array- 
based hybridization formats. Arrays are a multiplicity of different "probe" or "target" 
nucleic acids (or other compounds) attached to one or more surfaces (e.g., solid, 
membrane, or gel). In a preferred embodiment, the multiplicity of nucleic acids (or other 
moieties) is attached to a single contiguous surface or to a multiplicity of surfaces
juxtaposed to each other. 

In an array format a large number of different hybridization reactions can 
be run essentially "in parallel." This provides rapid, essentially simultaneous, evaluation
of a number of hybridizations in a single "experiment". Methods of performing
hybridization reactions in array based formats are well known to those of skill in the art
(see, e.g., Pastinen (1997) Genome Res. 7: 606-614; Jackson (1996) Nature
Biotechnology 14:1685; Chee (1995) Science 274: 610; WO 96/17958).

Arrays, particularly nucleic acid arrays can be produced according to a 
wide variety of methods well known to those of skill in the art. For example, in a simple 
embodiment, "low density" arrays can simply be produced by spotting (e.g. by hand using 
a pipette) different nucleic acids at different locations on a solid support (e.g. a glass 
surface, a membrane, etc.). 

This simple spotting approach has been automated to produce high
density spotted arrays (see, e.g., U.S. Patent No: 5,807,522). This patent describes the
use of an automated system that taps a microcapillary against a surface to deposit a small
volume of a biological sample. The process is repeated to generate high density arrays.
Arrays can also be produced using oligonucleotide synthesis technology. Thus, for 
example, U.S. Patent No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and 
92/10092 teach the use of light-directed combinatorial synthesis of high density 
oligonucleotide arrays. 

Many methods for immobilizing nucleic acids on a variety of solid 
surfaces are known in the art. A wide variety of organic and inorganic polymers, as well 
as other materials, both natural and synthetic, can be employed as the material for the 
solid surface. Illustrative solid surfaces include, e.g., nitrocellulose, nylon, glass, quartz, 
diazotized membranes (paper or nylon), silicones, polyformaldehyde, cellulose, and 
cellulose acetate. In addition, plastics such as polyethylene, polypropylene, polystyrene, 
and the like can be used. Other materials which may be employed include paper, 
ceramics, metals, metalloids, semiconductive materials, cermets or the like. In addition, 
substances that form gels can be used. Such materials include, e.g., proteins (e.g., 
gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides. Where the solid 
surface is porous, various pore sizes may be employed depending upon the nature of the 
system. 

In preparing the surface, a plurality of different materials may be 
employed, particularly as laminates, to obtain various properties. For example, proteins 
(e.g., bovine serum albumin) or mixtures of macromolecules (e.g., Denhardt's solution) 
can be employed to avoid non-specific binding, simplify covalent conjugation, enhance
signal detection or the like. If covalent bonding between a compound and the surface is
desired, the surface will usually be polyfunctional or be capable of being
polyfunctionalized. Functional groups which may be present on the surface and used for
linking can include carboxylic acids, aldehydes, amino groups, cyano groups, ethylenic
groups, hydroxyl groups, mercapto groups and the like. The manner of linking a wide
variety of compounds to various surfaces is well known and is amply illustrated in the 
literature. 

For example, methods for immobilizing nucleic acids by introduction of
various functional groups to the molecules are known (see, e.g., Bischoff (1987) Anal.
Biochem., 164: 336-344; Kremsky (1987) Nucl. Acids Res. 15: 2891-2910). Modified
nucleotides can be placed on the target using PCR primers containing the modified
nucleotide, or by enzymatic end labeling with modified nucleotides. Use of glass or
membrane supports (e.g., nitrocellulose, nylon, polypropylene) for the nucleic acid arrays
of the invention is advantageous because of well developed technology employing
manual and robotic methods of arraying targets at relatively high element densities. Such
membranes are generally available and protocols and equipment for hybridization to
membranes are well known.

Target elements of various sizes, ranging from 1 mm diameter down to 1
µm, can be used. Smaller target elements containing low amounts of concentrated, fixed
probe DNA are used for high complexity comparative hybridizations since the total
amount of sample available for binding to each target element will be limited. Thus it is
advantageous to have small array target elements that contain a small amount of
concentrated probe DNA so that the signal that is obtained is highly localized and bright.
Such small array target elements are typically used in arrays with densities greater than
10⁴/cm². Relatively simple approaches capable of quantitative fluorescent imaging of 1
cm² areas have been described that permit acquisition of data from a large number of
target elements in a single image (see, e.g., Wittrup (1994) Cytometry 16:206-213).
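
As a rough worked example of the density figure above, the following sketch relates element spacing to array density. It assumes, purely for illustration, a square grid with a uniform center-to-center pitch; the function names are hypothetical and are not part of the disclosure.

```python
import math

UM_PER_CM = 10_000.0  # 1 cm = 10,000 micrometers


def elements_per_cm2(pitch_um: float) -> float:
    """Elements per cm^2 on a square grid with the given center-to-center pitch (assumed layout)."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    return (UM_PER_CM / pitch_um) ** 2


def max_pitch_um(density_per_cm2: float) -> float:
    """Largest pitch, in micrometers, that still reaches the requested density."""
    if density_per_cm2 <= 0:
        raise ValueError("density must be positive")
    return UM_PER_CM / math.sqrt(density_per_cm2)
```

Under this assumed layout, a density of 10⁴ elements/cm² corresponds to a pitch of about 100 µm, so the individual target elements would need to be smaller than that.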

If fluorescently labeled nucleic acid samples are used, arrays on solid 
surface substrates with much lower fluorescence than membranes, such as glass, quartz, 
or small beads, can achieve much better sensitivity. Substrates such as glass or fused
silica are advantageous in that they provide a very low fluorescence substrate, and a 
highly efficient hybridization environment. Covalent attachment of the target nucleic 
acids to glass or synthetic fused silica can be accomplished according to a number of
known techniques (described above). Nucleic acids can be conveniently coupled to glass
using commercially available reagents. For instance, materials for preparation of 
silanized glass with a number of functional groups are commercially available or can be 
prepared using standard techniques (see, e.g., Gait (1984) Oligonucleotide Synthesis: A 
Practical Approach, IRL Press, Wash., D.C.). Quartz cover slips, which have at least 10- 
fold lower autofluorescence than glass, can also be silanized. 

Alternatively, probes can also be immobilized on commercially available 
coated beads or other surfaces. For instance, biotin end-labeled nucleic acids can be 
bound to commercially available avidin-coated beads. Streptavidin or anti-digoxigenin 
antibody can also be attached to silanized glass slides by protein-mediated coupling using,
e.g., protein A following standard protocols (see, e.g., Smith (1992) Science 258:
1122-1126). Biotin or digoxigenin end-labeled nucleic acids can be prepared according to
standard techniques. Hybridization to nucleic acids attached to beads is accomplished by 
suspending them in the hybridization mix, and then depositing them on the glass substrate 
for analysis after washing. Alternatively, paramagnetic particles, such as ferric oxide 
particles, with or without avidin coating, can be used. 

A variety of other nucleic acid hybridization formats are known to those 
skilled in the art. For example, common formats include sandwich assays and 
competition or displacement assays. Hybridization techniques are generally described in 
Hames and Higgins (1985) Nucleic Acid Hybridization, A Practical Approach, IRL Press; 
Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al. (1969)
Nature 223: 582-587. 

Sandwich assays are commercially useful hybridization assays for 
detecting or isolating nucleic acid sequences. Such assays utilize a "capture" nucleic acid 
covalently immobilized to a solid support and a labeled "signal" nucleic acid in solution. 
The sample will provide the target nucleic acid. The "capture" nucleic acid and "signal" 
nucleic acid probe hybridize with the target nucleic acid to form a "sandwich" 
hybridization complex. To be most effective, the signal nucleic acid should not hybridize 
with the capture nucleic acid. 

Detection of a hybridization complex may require the binding of a signal 
generating complex to a duplex of target and probe polynucleotides or nucleic acids. 
Typically, such binding occurs through ligand and anti-ligand interactions as between a 
ligand-conjugated probe and an anti-ligand conjugated with a signal.

The sensitivity of the hybridization assays may be enhanced through use of
a nucleic acid amplification system that multiplies the target nucleic acid being detected.
Examples of such systems include the polymerase chain reaction (PCR) system and the
ligase chain reaction (LCR) system. Other methods recently described in the art are the
nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario)
and Q Beta Replicase systems.

Nucleic acid hybridization simply involves providing a denatured probe
and target nucleic acid under conditions where the probe and its complementary target
can form stable hybrid duplexes through complementary base pairing. The nucleic acids
that do not form hybrid duplexes are then washed away leaving the hybridized nucleic
acids to be detected, typically through detection of an attached detectable label. It is
generally recognized that nucleic acids are denatured by increasing the temperature or
decreasing the salt concentration of the buffer containing the nucleic acids, by adding
chemical agents, or by raising the pH. Under low stringency conditions
(e.g., low temperature and/or high salt and/or high target concentration) hybrid duplexes
(e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed
sequences are not perfectly complementary. Thus specificity of hybridization is reduced
at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower
salt) successful hybridization requires fewer mismatches.

One of skill in the art will appreciate that hybridization conditions may be
selected to provide any degree of stringency. In a preferred embodiment, hybridization is
performed at low stringency to ensure hybridization and then subsequent washes are
performed at higher stringency to eliminate mismatched hybrid duplexes. Successive
washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25
X SSPE-T at 37°C to 70°C) until a desired level of hybridization specificity is obtained.
Stringency can also be increased by addition of agents such as formamide. Hybridization
specificity may be evaluated by comparison of hybridization to the test probes with
hybridization to the various controls that can be present.
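
To make the dependence of stringency on temperature, salt and formamide concrete, the sketch below uses one widely quoted empirical approximation for the melting temperature (Tm) of long DNA:DNA duplexes. The formula is not part of this disclosure, its coefficients differ somewhat between sources, and it is offered only as an illustration; the function names are hypothetical.

```python
import math


def approx_tm_celsius(na_molar: float, gc_percent: float, length_bp: int,
                      formamide_percent: float = 0.0) -> float:
    """Empirical Tm estimate for long DNA:DNA duplexes (illustrative coefficients)."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and duplex length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)   # lower salt lowers Tm
            + 0.41 * gc_percent             # higher G+C content raises Tm
            - 0.61 * formamide_percent      # formamide lowers Tm
            - 500.0 / length_bp)            # shorter duplexes melt at lower temperature


def stringency_margin(wash_temp_c: float, tm_c: float) -> float:
    """Degrees below Tm at which a wash is run; a smaller margin means a more stringent wash."""
    return tm_c - wash_temp_c
```

In this picture, lowering the salt concentration or adding formamide lowers the estimated Tm, so a wash at a fixed temperature sits closer to Tm and is therefore more stringent.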

In general, there is a tradeoff between hybridization specificity
(stringency) and signal intensity. Thus, in a preferred embodiment, the wash is performed
at the highest stringency that produces consistent results and that provides a signal
intensity greater than approximately 10% of the background intensity. Thus, in a
preferred embodiment, the hybridized array may be washed at successively higher
stringency solutions and read between each wash. Analysis of the data sets thus produced
will reveal a wash stringency above which the hybridization pattern is not appreciably
altered and which provides adequate signal for the particular probes of interest.
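
The wash-and-read analysis described above can be sketched as follows. This is a minimal illustration only: the measure of pattern change (largest difference between normalized intensity profiles), the tolerance value, and all names are assumptions introduced here, while the 10% signal criterion is taken from the text.

```python
from typing import List, Optional, Sequence


def normalized(pattern: Sequence[float]) -> List[float]:
    """Scale intensities so they sum to 1, making patterns comparable between reads."""
    total = sum(pattern)
    if total == 0:
        return [0.0 for _ in pattern]
    return [x / total for x in pattern]


def pattern_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Largest per-probe difference between two normalized patterns (assumed metric)."""
    return max(abs(a - b) for a, b in zip(normalized(before), normalized(after)))


def pick_wash(reads: Sequence[Sequence[float]], backgrounds: Sequence[float],
              min_signal_fraction: float = 0.10, tolerance: float = 0.05) -> Optional[int]:
    """Return the index of the most stringent wash that is both stable and adequately bright.

    `reads` holds probe intensities after each wash, ordered by increasing stringency.
    A wash is "stable" when its pattern barely differs from the preceding wash.
    """
    for i in range(len(reads) - 1, 0, -1):
        adequate = min(reads[i]) > min_signal_fraction * backgrounds[i]
        if adequate and pattern_change(reads[i - 1], reads[i]) < tolerance:
            return i
    return None
```

With four reads in which the pattern settles between the second and third washes and the signal collapses in the fourth, the function selects the third wash (index 2).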

Methods of optimizing hybridization conditions are well known to those of 
skill in the art (see, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and 
Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, Elsevier, N.Y.). 

Labeling and detection of nucleic acids. 

In a preferred embodiment, the hybridized nucleic acids are detected by 
detecting one or more labels attached to the sample or probe nucleic acids. The labels 
may be incorporated by any of a number of means well known to those of skill in the art. 
Means of attaching labels to nucleic acids include, for example nick translation or end- 
labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent 
attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label 
(e.g., a fluorophore). A wide variety of linkers for the attachment of labels to nucleic 
acids are also known. In addition, intercalating dyes and fluorescent nucleotides can also 
be used. 

Detectable labels suitable for use in the present invention include any 
composition detectable by spectroscopic, photochemical, biochemical, immunochemical, 
electrical, optical or chemical means. Useful labels in the present invention include biotin 
for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™),
fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and
the like, see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., ³H, ¹²⁵I,
³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others
commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold
particles in the 40-80 nm diameter size range scatter green light with high efficiency) or
colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents 
teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 
3,996,345; 4,277,437; 4,275,149; and 4,366,241. 

A fluorescent label is preferred because it provides a very strong signal 
with low background. It is also optically detectable at high resolution and sensitivity 
through a quick scanning procedure. The nucleic acid samples can all be labeled with a 
single label, e.g., a single fluorescent label. Alternatively, in another embodiment, 
different nucleic acid samples can be simultaneously hybridized where each nucleic acid
sample has a different label. For instance, one target could have a green fluorescent label
and a second target could have a red fluorescent label. The scanning step will distinguish
sites of binding of the red label from those binding the green fluorescent label. Each
nucleic acid sample (target nucleic acid) can be analyzed independently from one another.

Suitable chromogens which can be employed include those molecules and
compounds which absorb light in a distinctive range of wavelengths so that a color can be
observed or, alternatively, which emit light when irradiated with radiation of a particular
wavelength or wavelength range, e.g., fluorescers.

Desirably, fluorescers should absorb light above about 300 nm, preferably
above about 350 nm, and more preferably above about 400 nm, usually emitting at
wavelengths more than about 10 nm higher than the wavelength of the light absorbed. It
should be noted that the absorption and emission characteristics of the bound dye can
differ from the unbound dye. Therefore, when referring to the various wavelength ranges and
characteristics of the dyes, it is intended to indicate the dyes as employed and not the dye
which is unconjugated and characterized in an arbitrary solvent.
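
The wavelength guidelines above can be expressed as a simple screening check. The sketch below is illustrative only; the tier labels and function names are hypothetical, and the wavelengths supplied should be those of the dye as employed (i.e., conjugated), as the text notes.

```python
def absorption_tier(absorb_nm: float) -> str:
    """Classify an absorption maximum against the ~300/350/400 nm guidelines."""
    if absorb_nm > 400.0:
        return "more preferred"
    if absorb_nm > 350.0:
        return "preferred"
    if absorb_nm > 300.0:
        return "acceptable"
    return "not recommended"


def adequate_shift(absorb_nm: float, emit_nm: float, min_shift_nm: float = 10.0) -> bool:
    """True when emission lies more than about 10 nm above the absorbed wavelength."""
    return (emit_nm - absorb_nm) > min_shift_nm
```

For example, a label absorbing near 495 nm and emitting near 520 nm (illustrative values) falls in the most preferred tier and has an adequate shift.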

Fluorescers are generally preferred because by irradiating a fluorescer with 
light, one can obtain a plurality of emissions. Thus, a single label can provide for a 
plurality of measurable events. 

Detectable signal can also be provided by chemiluminescent and
bioluminescent sources. Chemiluminescent sources include a compound which becomes
electronically excited by a chemical reaction and can then emit light which serves as the
detectable signal or donates energy to a fluorescent acceptor. Alternatively, luciferins can
be used in conjunction with luciferase or lucigenins to provide bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin which can
be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels
include organic free radicals, transition metal complexes, particularly vanadium,
copper, iron, and manganese, and the like. Exemplary spin labels also include nitroxide
free radicals.

The label may be added to the target (sample) nucleic acid(s) prior to, or
after the hybridization. So called "direct labels" are detectable labels that are directly
attached to or incorporated into the target (sample) nucleic acid prior to hybridization. In
contrast, so called "indirect labels" are joined to the hybrid duplex after hybridization.
Often, the indirect label is attached to a binding moiety that has been attached to the
target nucleic acid prior to the hybridization. Thus, for example, the target nucleic acid
may be biotinylated before the hybridization. After hybridization, an avidin-conjugated
fluorophore will bind the biotin bearing hybrid duplexes providing a label that is easily
detected. For a detailed review of methods of labeling nucleic acids and detecting labeled
hybridized nucleic acids see Laboratory Techniques in Biochemistry and Molecular
Biology, Vol 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y.,
(1993).

Fluorescent labels are easily added during an in vitro transcription 
reaction. Thus, for example, fluorescein labeled UTP and CTP can be incorporated into 
the RNA produced in an in vitro transcription. 

The labels can be attached directly or through a linker moiety. In general, 
the site of label or linker-label attachment is not limited to any specific position. For 
example, a label may be attached to a nucleoside, nucleotide, or analogue thereof at any 
position that does not interfere with detection or hybridization as desired. For example, 
certain Label-ON Reagents from Clontech (Palo Alto, CA) provide for labeling 
interspersed throughout the phosphate backbone of an oligonucleotide and for terminal 
labeling at the 3' and 5' ends. As shown for example herein, labels can be attached at 
positions on the ribose ring or the ribose can be modified and even eliminated as desired. 
The base moieties of useful labeling reagents can include those that are naturally 
occurring or modified in a manner that does not interfere with the purpose to which they 
are put. Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza A 
and G, and other heterocyclic moieties. 

It will be recognized that fluorescent labels are not to be limited to single 
species organic molecules, but include inorganic molecules, multi-molecular mixtures of 
organic and/or inorganic molecules, crystals, heteropolymers, and the like. Thus, for 
example, CdSe-CdS core-shell nanocrystals enclosed in a silica shell can be easily 
derivatized for coupling to a biological molecule (Bruchez et al. (1998) Science, 281:
2013-2016). Similarly, highly fluorescent quantum dots (zinc sulfide-capped cadmium
selenide) have been covalently coupled to biomolecules for use in ultrasensitive
biological detection (Warren and Nie (1998) Science, 281: 2016-2018).

Amplification-based assays.

In another embodiment, amplification-based assays can be used to detect 
nucleic acids. In such amplification-based assays, the nucleic acid sequences act as a
template in an amplification reaction (e.g., Polymerase Chain Reaction (PCR)). Detailed
protocols for quantitative PCR are provided in Innis et al. (1990) PCR Protocols, A Guide
to Methods and Applications, Academic Press, Inc. N.Y.
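
As a brief illustration of why amplification increases sensitivity, the sketch below uses the textbook idealization that each PCR cycle multiplies the template by (1 + efficiency), with an efficiency of 1.0 corresponding to perfect doubling. Real reactions plateau and deviate from this model; the function names are hypothetical.

```python
import math


def copies_after(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized template count after a number of cycles (no plateau modeled)."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(initial_copies: float, target_copies: float,
                    efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles at which the idealized count reaches the target."""
    if initial_copies <= 0 or target_copies <= 0 or efficiency <= 0:
        raise ValueError("copies and efficiency must be positive")
    if target_copies <= initial_copies:
        return 0
    ratio = target_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))
```

Under perfect doubling, ten starting copies would exceed 10⁹ copies after 27 cycles in this model; lower efficiencies require more cycles.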

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu and Wallace (1989) Genomics 4: 560, Landegren et al.
(1988) Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription
amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), and
self-sustained sequence replication (Guatelli et al. (1990) Proc. Nat. Acad. Sci. USA 87:
1874).

Detection of C. pneumoniae gene expression 

The nucleic acids of the invention can also be used to detect C. pneumoniae
gene transcripts. Methods of detecting and/or quantifying gene transcripts using
nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook
et al. supra). For example, a Northern transfer may be used for the detection of the
desired mRNA directly. In brief, the mRNA is isolated from a given cell sample using, 
for example, an acid guanidinium-phenol-chloroform extraction method. The mRNA is 
then electrophoresed to separate the mRNA species and the mRNA is transferred from the 
gel to a nitrocellulose membrane. As with the Southern blots, labeled probes are used to 
identify and/or quantify the target mRNA. 

In another preferred embodiment, the gene transcript can be measured 
using amplification (e.g. PCR) based methods as described above for directly assessing 
copy number of the target sequences. 

Expression of C. pneumoniae proteins 

The nucleic acids disclosed here can be used for recombinant expression 
of the proteins. In these methods, the nucleic acids encoding the proteins of interest are 
introduced into suitable host cells, followed by induction of the cells to produce large 
amounts of the protein. The invention relies on routine techniques in the field of
recombinant genetics. A basic text disclosing the general methods of use in this
invention is Sambrook et al., Molecular
Cloning, A Laboratory Manual (2nd ed. 1989).

Standard transfection methods are used to produce prokaryotic,
mammalian, yeast or insect cell lines which express large quantities of the desired
polypeptide, which is then purified using standard techniques (see, e.g., Colley et al., J.
Biol. Chem. 264:17619-17622, 1989; Guide to Protein Purification, supra).

The nucleotide sequences used to transfect the host cells can be modified 
to yield Chlamydia polypeptides with a variety of desired properties. For example, the 
polypeptides can vary from the naturally-occurring sequence at the primary structure
level by amino acid insertions, substitutions, deletions, and the like. These modifications
can be used in a number of combinations to produce the final modified protein chain. 

The amino acid sequence variants can be prepared with various objectives
in mind, including facilitating purification and preparation of the recombinant
polypeptide. The modified polypeptides are also useful for modifying plasma half life,
improving therapeutic efficacy, and lessening the severity or occurrence of side effects
during therapeutic use. The amino acid sequence variants are usually predetermined
variants that are not found in nature but exhibit the same immunogenic activity as the
naturally occurring protein. In general, modifications of the sequences encoding the
polypeptides may be readily accomplished by a variety of well-known techniques, such as
site-directed mutagenesis (see Gillman & Smith, Gene 8:81-97 (1979); Roberts et al.,
Nature 328:731-734 (1987)). One of ordinary skill will appreciate that the effect of many
mutations is difficult to predict. Thus, most modifications are evaluated by routine
screening in a suitable assay for the desired characteristic. For instance, the effect of
various modifications on the ability of the polypeptide to elicit a protective immune
response can be easily determined using in vitro assays. For example, the polypeptides can
be tested for their ability to induce lymphoproliferation, T cell cytotoxicity, or cytokine
production using standard techniques.

The particular procedure used to introduce the genetic material into the
host cell for expression of the polypeptide is not particularly critical. Any of the well
known procedures for introducing foreign nucleotide sequences into host cells may be
used. These include the use of calcium phosphate transfection, spheroplasts,
electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the
other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA
or other foreign genetic material into a host cell (see Sambrook et al., supra). It is only
necessary that the particular procedure utilized be capable of successfully introducing at
least one gene into the host cell which is capable of expressing the gene.




Any of a number of well known cells and cell lines can be used to express
the polypeptides of the invention. For instance, prokaryotic cells such as E. coli can be
used. Eukaryotic cells include yeast, Chinese hamster ovary (CHO) cells, COS cells, and
insect cells.

The particular vector used to transport the genetic information into the cell
is also not particularly critical. Any of the conventional vectors used for expression of
recombinant proteins in prokaryotic and eukaryotic cells may be used. Expression
vectors for mammalian cells typically contain regulatory elements from eukaryotic
viruses.

The expression vector typically contains a transcription unit or expression
cassette that contains all the elements required for the expression of the polypeptide DNA
in the host cells. A typical expression cassette contains a promoter operably linked to the
DNA sequence encoding a polypeptide and signals required for efficient polyadenylation
of the transcript. The term "operably linked" as used herein refers to linkage of a
promoter upstream from a DNA sequence such that the promoter mediates transcription
of the DNA sequence. The promoter is preferably positioned about the same distance
from the heterologous transcription start site as it is from the transcription start site in its
natural setting. As is known in the art, however, some variation in this distance can be
accommodated without loss of promoter function.

Following the growth of the recombinant cells and expression of the
polypeptide, the culture medium is harvested for purification of the secreted protein. The
media are typically clarified by centrifugation or filtration to remove cells and cell debris
and the proteins are concentrated by adsorption to any suitable resin or by use of
ammonium sulfate fractionation, polyethylene glycol precipitation, or by ultrafiltration.
Other routine means known in the art may be equally suitable. Further purification of the
polypeptide can be accomplished by standard techniques, for example, affinity
chromatography, ion exchange chromatography, sizing chromatography, His6 tagging and
Ni-agarose chromatography (as described in Dobeli et al., Mol. and Biochem. Parasit.
41:259-268 (1990)), or other protein purification techniques to obtain homogeneity. The
purified proteins are then used to produce pharmaceutical compositions, as described
below.

An alternative method of preparing recombinant polypeptides useful as 
vaccines involves the use of recombinant viruses (e.g., vaccinia). Vaccinia virus is grown
in suitable cultured mammalian cells such as the HeLa S3 spinner cells, as described by
Mackett et al., in DNA cloning Vol II: A practical approach, pp. 191-211 (Glover, ed.).



Antibody Production 

The proteins of the present invention can be used to produce antibodies 
specifically reactive with C. pneumoniae antigens. If isolated proteins are used, they may
be recombinantly produced or isolated from Chlamydia cultures. Synthetic peptides 
made using the protein sequences may also be used. 

Methods of production of polyclonal antibodies are known to those of skill
in the art. In brief, an immunogen, preferably a purified protein, is mixed with an
adjuvant and animals are immunized. When appropriately high titers of antibody to the
immunogen are obtained, blood is collected from the animal and antisera are prepared.
Further fractionation of the antisera to enrich for antibodies reactive to Chlamydia
proteins can be done if desired (see Harlow & Lane, Antibodies: A Laboratory Manual
(1988)).

Polyclonal antisera are used to identify and characterize Chlamydia in the
tissues of patients using, for instance, in situ techniques and immunoperoxidase test
procedures described in Anderson et al. JAVMA 198:241 (1991) and Barr et al. Vet.
Pathol. 28:110-116 (1991).

Monoclonal antibodies may be obtained by various techniques familiar to
those skilled in the art. Briefly, spleen cells from an animal immunized with a desired
antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler &
Milstein, Eur. J. Immunol. 6:511-519 (1976)). Alternative methods of immortalization
include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other
methods well known in the art. Colonies arising from single immortalized cells are
screened for production of antibodies of the desired specificity and affinity for the
antigen, and yield of the monoclonal antibodies produced by such cells may be enhanced
by various techniques, including injection into the peritoneal cavity of a vertebrate host.

Monoclonal antibodies produced in such a manner are used, for instance,
in ELISA diagnostic tests, immunoperoxidase tests, immunohistochemical tests, for the in
vitro evaluation of spirochete invasion, to select candidate antigens for vaccine
development, protein isolation, and for screening genomic and cDNA libraries to select
appropriate gene sequences.

Immunodiagnostic detection of C. pneumoniae infections

The present invention also provides methods for detecting the presence or 
absence of C. pneumoniae, or antibodies reactive with it, in a biological sample. For 
instance, antibodies specifically reactive with Chlamydia can be detected using either 
Chlamydia proteins or the isolates described here. The proteins and isolates can also be
used to raise specific antibodies (either monoclonal or polyclonal) to detect the antigen in 
a sample. In addition, the nucleic acids disclosed and claimed here can be used to detect 
Chlamydia-specific sequences using standard hybridization techniques. 

For a review of immunological and immunoassay procedures in general,
see Basic and Clinical Immunology (Stites & Terr ed., 7th ed. 1991). The immunoassays
of the present invention can be performed in any of several configurations, which are
reviewed extensively in Enzyme Immunoassay (Maggio, ed., 1980) and Tijssen, Laboratory
Techniques in Biochemistry and Molecular Biology (1985). For instance, the proteins
and antibodies disclosed here are conveniently used in ELISA, immunoblot analysis and
agglutination assays.

In brief, immunoassays to measure anti-Chlamydia antibodies or antigens
can be either competitive or noncompetitive binding assays. In competitive binding
assays, the sample analyte (e.g., anti-Chlamydia antibodies) competes with a labeled
analyte (e.g., anti-Chlamydia monoclonal antibody) for specific binding sites on a capture
agent (e.g., isolated Chlamydia protein) bound to a solid surface. The concentration of
labeled analyte bound to the capture agent is inversely proportional to the amount of free
analyte present in the sample.
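
The inverse relationship in a competitive assay can be illustrated with a deliberately simplified model. The sketch assumes, for illustration only, that the binding sites are limiting and fully occupied, and that labeled and sample analyte bind with equal affinity, so the labeled analyte occupies sites in proportion to its share of the total analyte. Real assays are calibrated against standards rather than computed this way; all names are hypothetical.

```python
def bound_labeled(sites: float, labeled: float, sample: float) -> float:
    """Labeled analyte bound when labeled and sample analyte compete for saturated sites."""
    if labeled <= 0 or sample < 0:
        raise ValueError("labeled must be positive and sample non-negative")
    return sites * labeled / (labeled + sample)


def estimate_sample(labeled: float, signal_no_sample: float, signal: float) -> float:
    """Invert the model: sample analyte implied by the drop in bound-label signal."""
    if signal <= 0:
        raise ValueError("signal must be positive")
    return labeled * (signal_no_sample / signal - 1.0)
```

In this model, adding sample analyte equal in amount to the labeled analyte halves the bound-label signal, and the second function recovers that amount from the measured drop.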

Noncompetitive assays are typically sandwich assays, in which the sample 
analyte is bound between two analyte-specific binding reagents. One of the binding 
agents is used as a capture agent and is bound to a solid surface. The second binding
agent is labelled and is used to measure or detect the resultant complex by visual or 
instrument means. 

A number of combinations of capture agent and labelled binding agent can 
be used. For instance, an isolated Chlamydia protein or culture can be used as the 
capture agent and labelled anti-human antibodies specific for the constant region of
human antibodies can be used as the labelled binding agent. Goat, sheep and other
non-human antibodies specific for human immunoglobulin constant regions (e.g., γ or μ) are
well known in the art. Alternatively, the anti-human antibodies can be the capture agent
and the antigen can be labelled. 

Various components of the assay, including the antigen, anti-Chlamydia 
antibody, or anti-human antibody, may be bound to a solid surface. Many methods for 
immobilizing biomolecules to a variety of solid surfaces are known in the art. For
instance, the solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish 
(e.g., PVC or polystyrene) or a bead. The desired component may be covalently bound or 
noncovalently attached through nonspecific bonding. 

Alternatively, the immunoassay may be carried out in liquid phase and a
variety of separation methods may be employed to separate the bound labelled component
from the unbound labelled components. These methods are known to those of skill in the
art and include immunoprecipitation, column chromatography, adsorption, addition of
magnetizable particles coated with a binding agent and other similar procedures.

An immunoassay may also be carried out in liquid phase without a
separation procedure. Various homogeneous immunoassay methods are now being
applied to immunoassays for protein analytes. In these methods, the binding of the
binding agent to the analyte causes a change in the signal emitted by the label, so that
binding may be measured without separating the bound from the unbound labelled
component.

Western blot (immunoblot) analysis can also be used to detect the presence
of antibodies to Chlamydia in the sample. This technique is a reliable method for
confirming the presence of antibodies against a particular protein in the sample. The
technique generally comprises separating proteins by gel electrophoresis on the basis of
molecular weight, transferring the separated proteins to a suitable solid support (such as a
nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample
with the separated proteins. This causes specific target antibodies present in the sample
to bind their respective proteins. Target antibodies are then detected using labeled
anti-human antibodies.

The immunoassay formats described above employ labelled assay
components. The label may be coupled directly or indirectly to the desired component of
the assay according to methods well known in the art. A wide variety of labels may be
used. The component may be labelled by any one of several methods. Traditionally a
radioactive label incorporating ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P was used. Non-radioactive labels
include ligands which bind to labelled antibodies, fluorophores, chemiluminescent agents,
enzymes, and antibodies which can serve as specific binding pair members for a labelled
ligand. The choice of label depends on sensitivity required, ease of conjugation with the
compound, stability requirements, and available instrumentation.

Enzymes of interest as labels will primarily be hydrolases, particularly
phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases.
Fluorescent compounds include fluorescein and its derivatives, rhodamine and its
derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin,
and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labelling or
signal producing systems which may be used, see U.S. Patent No. 4,391,904, which is
incorporated herein by reference.

Non-radioactive labels are often attached by indirect means. Generally, a
ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds
to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or
covalently bound to a signal system, such as a detectable enzyme, a fluorescent
compound, or a chemiluminescent compound. A number of ligands and anti-ligands can
be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and
cortisol, it can be used in conjunction with the labelled, naturally occurring anti-ligands.
Alternatively, any haptenic or antigenic compound can be used in combination with an
antibody.

Some assay formats do not require the use of labelled components. For 
instance, agglutination assays can be used to detect the presence of the target antibodies. 
In this case, antigen-coated particles are agglutinated by samples comprising the target 
antibodies. In this format, none of the components need be labelled and the presence of 
the target antibody is detected by simple visual inspection.

Pharmaceutical Compositions 

The peptides or antibodies (typically monoclonal antibodies) of the present
invention can be administered to
mammals, particularly humans, to treat and/or prevent Chlamydia infections. Suitable
formulations are found in Remington's Pharmaceutical Sciences, Mack Publishing
Company, Philadelphia, PA, 17th ed. (1985).

The immunogenic peptides or antibodies of the invention are administered
prophylactically or to an individual already suffering from the disease. The peptide
compositions are administered to a patient in an amount sufficient to elicit an effective
immune response to Chlamydia. An effective immune response is one that inhibits
infection. An amount adequate to accomplish this is defined as a "therapeutically effective
dose" or "immunogenically effective dose." Amounts effective for this use will depend
on, e.g., the peptide composition, the manner of administration, the stage and severity of
the disease being treated, the weight and general state of health of the patient, and the
judgment of the prescribing physician, but generally range for the initial immunization
(that is, for therapeutic or prophylactic administration) from about 0.1 mg to about 1.0 mg
per 70 kilogram patient, more commonly from about 0.5 mg to about 0.75 mg per 70 kg
of body weight. Boosting dosages are typically from about 0.1 mg to about 0.5 mg of
peptide using a boosting regimen over weeks to months depending upon the patient's
response and condition. A suitable protocol would include injection at time 0, 4, 2, 6, 10
and 14 weeks, followed by further booster injections at 24 and 28 weeks.

For therapeutic use, administration should begin at the first sign of
infection. This is followed by boosting doses until at least symptoms are substantially
abated and for a period thereafter. In some circumstances, loading doses followed by
boosting doses may be required. The resulting immune response helps to cure or at least
partially arrest symptoms and/or complications. Vaccine compositions containing the
peptides are administered prophylactically to a patient susceptible to or otherwise at risk
of the infection.

The pharmaceutical compositions (containing either peptides or
antibodies) are intended for parenteral or oral administration. Preferably, the
pharmaceutical compositions are administered parenterally, e.g., subcutaneously,
intradermally, or intramuscularly. Thus, the invention provides compositions for
parenteral administration which comprise a solution of the immunogenic polypeptides
dissolved or suspended in an acceptable carrier, preferably an aqueous carrier. A variety
of aqueous carriers may be used, e.g., water, buffered water, 0.4% saline, 0.3% glycine,
hyaluronic acid and the like. These compositions may be sterilized by conventional, well
known sterilization techniques, or may be sterile filtered. The resulting aqueous solutions
may be packaged for use as is, or lyophilized, the lyophilized preparation being combined
with a sterile solution prior to administration. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions, such as buffering agents, tonicity adjusting agents, wetting
agents and the like, for example, sodium acetate, sodium lactate, sodium chloride,
potassium chloride, calcium chloride, sorbitan monolaurate, triethanolamine oleate, etc.

The compositions may also comprise carriers to enhance the immune
response. Useful carriers are well known in the art, and include, e.g., KLH,
thyroglobulin, albumins such as human serum albumin, tetanus toxoid, polyamino acids
such as poly(lysine:glutamic acid), influenza, hepatitis B virus core protein, hepatitis B
virus recombinant vaccine and the like.

For solid compositions, conventional nontoxic solid carriers may be used
which include, for example, pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium
carbonate, and the like. For oral administration, a pharmaceutically acceptable nontoxic
composition is formed by incorporating any of the normally employed excipients, such as
those carriers previously listed, and generally 10-95% of active ingredient, that is, one or
more peptides of the invention, and more preferably at a concentration of 25%-75%.

As noted above, the peptide compositions are intended to induce an
immune response to Chlamydia. Thus, compositions and methods of administration
suitable for maximizing the immune response are preferred. For instance, peptides may
be introduced into a host, including humans, linked to a carrier or as a homopolymer or
heteropolymer of active peptide units from various Chlamydia proteins disclosed here.
Alternatively, a "cocktail" of polypeptides can be used. A mixture of more than one
polypeptide has the advantage of increased immunological reaction and, where different
peptides are used to make up the polymer, the additional ability to induce antibodies to a
number of epitopes.

The compositions also include an adjuvant. A number of
adjuvants are well known to one skilled in the art. Suitable adjuvants include incomplete
Freund's adjuvant, alum, aluminum phosphate, aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE),
and RIBI, which contains three components extracted from bacteria, monophosphoryl
lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2%
squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by
measuring the amount of antibodies directed against the immunogenic peptide.

The concentration of immunogenic peptides of the invention in the 
pharmaceutical formulations can vary widely, i.e., from less than about 0.1%, usually at
or at least about 2% to as much as 20% to 50% or more by weight, and will be selected 
primarily by fluid volumes, viscosities, etc., in accordance with the particular mode of 
administration selected. 

The peptides of the invention can also be expressed by attenuated viral
hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus as a
vector to express nucleotide sequences that encode the peptides of the invention. Upon
introduction into a host, the recombinant vaccinia virus expresses the immunogenic
peptide, and thereby elicits an immune response. Vaccinia vectors and methods useful in
immunization protocols are described in, e.g., U.S. Patent No. 4,722,848. Another vector
is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al. (Nature
351:456-460 (1991)). A wide variety of other vectors useful for therapeutic
administration or immunization of the peptides of the invention, e.g., Salmonella typhi
vectors and the like, will be apparent to those skilled in the art from the description
herein.

The DNA encoding one or more of the peptides of the invention can also
be administered to the patient. This approach is described, for instance, in Wolff et al.,
Science 247:1465-1468 (1990) as well as U.S. Patent Nos. 5,580,859 and 5,589,466.

In order to enhance serum half-life, the peptides may also be encapsulated,
introduced into the lumen of liposomes, prepared as a colloid, or other conventional
techniques may be employed which provide an extended serum half-life of the peptides.
A variety of methods are available for preparing liposomes, as described in, e.g., Szoka et
al., Ann. Rev. Biophys. Bioeng. 9:467 (1980), U.S. Pat. Nos. 4,235,871, 4,501,728 and
4,837,028.

EXAMPLES

The following examples are offered to illustrate, but not to limit the
claimed invention.

Example 1:

This example describes comparison of the C. pneumoniae genome
disclosed here and the previously sequenced C. trachomatis genome (Stephens, et al.
Science 282:754-759 (1998)).

The apparent low level of DNA homology between C. trachomatis and C.
pneumoniae (Campbell, et al., J. Clin. Microbiol. 25:1911-1916 (1987)), yet analogous
cell structures and developmental cycles, predicts that comparative analysis of the two
genomes will significantly enhance the understanding of both pathogens. Identification
of genes that are present in one species but not the other is of particular importance for
the mutually exclusive biological, virulence and pathogenesis capabilities of each.
Identification of genes shared between the two species strongly supports the requirement
for these capabilities in a biological system that has, over its long-term association with
mammalian host cells, evolved to reduce the metabolic capacities while optimizing
survival, growth and transmission of these unique pathogens.

The previously sequenced C. trachomatis genome contains 1,042,519
nucleotides and 875 likely protein-coding genes. Similarity searching permitted the
inferred functional assignment of 636 (60%) of the genes disclosed here, and 251
(23%) are similar to hypothetical genes from other bacterial organisms, including those of
C. trachomatis. The remaining 186 (17%) genes are not homologous to sequences
deposited in GenBank. Seventy C. trachomatis genes are not represented in the C.
pneumoniae genome. These are contained within blocks consisting of 2-17 genes and 19
single genes. Of the 70 C. trachomatis genes without homologs in C. pneumoniae, 60 are
classified as encoding hypothetical proteins. The remaining genes not represented in C.
pneumoniae consist of the tryptophan operon (trpA, B, R), trpC, two predicted thiol
protease genes, and 4 genes assigned to the phospholipase-D superfamily.
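
The category percentages quoted above can be checked with a short calculation. This is a sketch only: it assumes the three categories partition the full set of C. pneumoniae coding sequences, so the total is taken as their sum, and the dictionary keys and function name are illustrative labels. Under that assumption 636 genes come to about 59.3%, which the text reports as 60%.

```python
def category_percentages(counts):
    """Return (total, {category: percent of total}) for a dict of gene counts."""
    total = sum(counts.values())
    return total, {name: 100.0 * n / total for name, n in counts.items()}


# Counts taken from the paragraph above; the labels are illustrative.
GENE_COUNTS = {
    "functional assignment": 636,
    "similar to hypothetical genes": 251,
    "no GenBank homolog": 186,
}

total, percentages = category_percentages(GENE_COUNTS)
for name, value in percentages.items():
    print(f"{name}: {GENE_COUNTS[name]} genes, {value:.1f}% of {total}")
```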

It is evident that there is a high level of functional conservation between C. 
pneumoniae and C. trachomatis as orthologs to C. trachomatis genes were identified for 
859 (80%) of the predicted coding sequences for C. pneumoniae. The level of similarity 
for individual encoded proteins spans a wide spectrum (22-95% amino acid identity) with 
an average of 62% amino acid identity between orthologs from the two species. The 
percent amino acid identity between orthologous chlamydial proteins is similar among 
functional groups with the highest for proteins associated with translation and the lowest 
for proteins whose function in chlamydiae is uncharacterized and not related to proteins 
encoded by other organisms. The gene order of the homologous set of genes in C.
pneumoniae shows reorganization relative to the genome of C. trachomatis; however,
there is a high level of synteny for the gene organization of the two genomes. We
identified thirty-nine blocks of 2 or more genes whose gene organization is colinear with
homologs to C. trachomatis, although some of these are inverted. Genome
reorganization is not evenly distributed on the chromosome, as the region
between C. pneumoniae coding sequences 0130-0300 contains substantially more
reorganization than other areas of the genome. This region coincides with the predicted
chromosome replication terminus. 
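
The notion of colinear blocks of 2 or more genes, some of them inverted, can be sketched as follows. This is an illustrative approach, not the method used to produce the counts above. It assumes each gene in one genome has been given the ordinal position of its ortholog in the other genome, and it treats a block as a maximal run of consecutive positions that step by +1 (colinear) or -1 (inverted). The function name and input format are hypothetical.

```python
def colinear_blocks(ortholog_positions, min_len=2):
    """Find maximal runs of orthologs that keep the same neighbour order.

    ortholog_positions[i] is the ordinal position, in the second genome, of
    the ortholog of the i-th gene of the first genome. Returns a list of
    (first_index, last_index, orientation) tuples.
    """
    blocks = []
    start = 0
    direction = 0  # 0 = not yet known, +1 = colinear, -1 = inverted
    n = len(ortholog_positions)
    for i in range(1, n + 1):
        step = ortholog_positions[i] - ortholog_positions[i - 1] if i < n else None
        if step in (1, -1) and direction in (0, step):
            direction = step  # the current run continues
            continue
        if i - start >= min_len:
            orientation = "inverted" if direction == -1 else "colinear"
            blocks.append((start, i - 1, orientation))
        start = i
        direction = 0
    return blocks


example = [1, 2, 3, 10, 9, 8, 20, 5, 6]
print(colinear_blocks(example))
```

In the example, genes 0-2 and 7-8 form colinear blocks, genes 3-5 form an inverted block, and gene 6 stands alone.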

We identified orthologs of enzymes characterized in other bacteria that
account for the essential requirements for DNA replication, repair, transcription and
translation, including two predicted DNA helicases of the Swi2/Snf2 family found in C.
trachomatis. Similar to C. trachomatis, alternative sigma subunits for RNA polymerase,
σ28 and σ54, were identified in addition to anti-σ regulatory system factors RsbV, a
RsbW-like single-domain histidine kinase, and a RsbU-like protein phosphatase. These
findings suggest that the fundamental mechanisms of transcriptional regulation are
conserved among Chlamydia. The C. trachomatis proteins containing SET and SWIB
domains, and a SWIB domain fused to the C-terminus of the chlamydial topoisomerase I,
not identified outside eukaryotes, are found in C. pneumoniae, supporting their possible
role in the chromatin condensation-decondensation characteristic of the biologically
unique chlamydial developmental cycle.

The central metabolic pathways inferred from the C. pneumoniae genome
sequence are the same as those identified for C. trachomatis. C. pneumoniae has a
glycolytic pathway and a linked tricarboxylic acid cycle that, although likely functional, is
incomplete, as genes for citrate synthase, aconitase, and isocitrate dehydrogenase were not
identified. C. pneumoniae has a complete glycogen synthesis and degradation system,
supporting a role for glycogen synthesis and utilization of glucose-derivatives in
chlamydial metabolism. Genes encoding essential functions in aerobic respiration are
present and electron flux may be supported by pyruvate, succinate, glycerol-3-phosphate,
and NADH dehydrogenases, NADH-ubiquinone oxidoreductase and cytochrome oxidase.
C. pneumoniae also contains the V (vacuolar)-type ATPase operon and the two ATP
translocases found in C. trachomatis.

The type-III secretion virulence system required for invasion by several
pathogenic bacteria and found in the C. trachomatis genome in three chromosomal
locations is also present in the C. pneumoniae genome. Each of the components is
conserved and their relative genomic contexts are conserved. Genes such as a predicted
serine/threonine protein kinase and other genes physically linked to genes encoding
structural components of the type-III secretion apparatus, but without identified
homologs, are also highly similar between the two species, suggesting the functional roles
in modifying cellular biology are fundamentally conserved.

Chlamydia-encoded proteins that are not found in chlamydial organisms
but localized to the intracellular chlamydial inclusion membrane are likely essential for
the unique intracellular biology and perhaps differences in inclusion morphology
observed between species of Chlamydia. Several such proteins, termed IncA, B and C, have
been characterized for a C. psittaci strain (Rockey, et al. Mol. Microbiol. 15:617-626
(1995); Rockey et al. Infect. Immun. 62:106-112 (1994)). C. pneumoniae and C.
trachomatis encode orthologs to C. psittaci IncB and IncC, and C. trachomatis also
contains an ortholog to IncA. C. pneumoniae contains two genes that encode proteins
with similarity to IncA (CPn0186 and CPn0585), although the level of homology is low,
suggesting analogous but possibly altered functions.

The tryptophan biosynthesis operon (trpA, trpB, trpR) and trpC identified
in C. trachomatis are conspicuously missing in the C. pneumoniae genome. This
represents the entire repertoire of genes associated with tryptophan biosynthesis identified
in C. trachomatis. Seventeen genes adjacent to the C. trachomatis tryptophan operon also
were not found in the C. pneumoniae genome. This region is the single largest loss of a
contiguous genomic segment and includes 4 HKD superfamily encoding genes that
encompass a family of proteins related to endonuclease and phospholipase D. These
findings may be important for the ability of Chlamydia to persist in their hosts and cause
disease by eliciting potent, focal and persistent inflammatory responses thought to be
essential for pathogenesis.

The C. pneumoniae genome contains 187,711 additional nucleotides
compared to the C. trachomatis genome, and the 214 coding sequences not found in C.
trachomatis account for most of the increased genome size. Eighty-eight of these genes
are found in blocks of >10 genes (11-30 genes/block), 41 are single genes, and the
remainder are partnered with at least one other gene. Based upon the observation that
~70% of all the C. pneumoniae genes have an identifiable homolog in GenBank,
exclusive of C. trachomatis, it would be expected that over 150 of the 214 genes should
have a homolog in GenBank, many associated with a function. However, only 28 coding
sequences have similarity to genes from other organisms. Thus the majority of the genes
that are mutually exclusive of C. trachomatis (186 of 214), and the 60 of 70 C.
trachomatis genes that lacked an identifiable homolog in C. pneumoniae, do not have
detectable homologs to genes from other organisms. We predict that most of the unique
genes are essential for specific attributes that define the differential biology, tropism and
pathogenesis of C. trachomatis and C. pneumoniae. Moreover, this suggests that C.
pneumoniae has more unique biological (i.e., virulence) capacity than C. trachomatis.
The ability of C. pneumoniae to be more invasive and survive in a broader range of host
cell types than C. trachomatis is consistent with this hypothesis. Not all of the
differences in biological capacity may be associated with mutually exclusive genes. One
explanation for the significantly lower level of homology between protein sequences
assigned as having C. pneumoniae and C. trachomatis orthologs but no identifiable
orthologs in other organisms is that this set of proteins is not only associated with
biological requirements specific for Chlamydia but this polymorphism may account for
differential biology between the two species. The determination of the genome sequence
from a representative of the C. psittaci group will precisely delineate those genes that are
mutually exclusive and specific for each species.
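
The arithmetic behind these comparisons can be laid out in a few lines. This is a sketch for checking the quoted figures only. The implied C. pneumoniae genome length is derived here by adding the stated difference to the C. trachomatis length given earlier in this example, and the 70% rate is the approximate value quoted in the text. The variable names are illustrative.

```python
# Figures quoted in this example.
CT_GENOME_NT = 1_042_519      # C. trachomatis genome length
ADDITIONAL_NT = 187_711       # extra nucleotides in C. pneumoniae
UNIQUE_CPN_GENES = 214        # coding sequences not found in C. trachomatis
WITH_OTHER_HOMOLOG = 28       # of those, similar to genes from other organisms
APPROX_HOMOLOG_RATE = 0.70    # approximate share of genes with a GenBank homolog

# Derived values.
cpn_genome_nt = CT_GENOME_NT + ADDITIONAL_NT
expected_with_homolog = APPROX_HOMOLOG_RATE * UNIQUE_CPN_GENES
without_homolog = UNIQUE_CPN_GENES - WITH_OTHER_HOMOLOG

print(f"Implied C. pneumoniae genome length: {cpn_genome_nt:,} nt")
print(f"Expected unique genes with a homolog: about {expected_with_homolog:.0f}")
print(f"Observed unique genes without a homolog: {without_homolog} of {UNIQUE_CPN_GENES}")
```

The gap between roughly 150 expected and 28 observed homologs is the basis for the statement that most of the unique genes have no detectable counterpart elsewhere.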

The major functionally identifiable addition to the C. pneumoniae genome
is a large expansion of genes encoding a new family of chlamydial polymorphic
membrane proteins (Pmp), alone representing 22% of the increased coding capacity.
While the C. trachomatis genome has 9 pmp genes, remarkably the C. pneumoniae
genome contains 21 pmp genes. Most of these genes appear to be amplified in two
regions of the genome with three stand-alone genes. Interestingly, one of the stand-alone
genes is most closely related to the C. trachomatis pmpD, which is the only stand-alone
pmp gene in the C. trachomatis genome, and it is located with the same relative genomic
context, suggesting an essential and conserved function for this paralog. Six Pmp-coding
genes are presumably not functional, as five contain predicted coding frame-shifts and one
is truncated. The amplification of this gene family and the confidently predicted
frame-shifts suggest a specific molecular mechanism to promote functional or antigenic
diversity. The biological role of this protein family remains enigmatic, although at least
one of the proteins in C. psittaci related to this family is exposed on the chlamydial
surface.

While a function could not be assigned for most of the unique C.
pneumoniae genes, several have significant similarity to genes from other organisms.
Functional assignments could be made for genes encoding GMP synthetase, IMP
dehydrogenase, UMP synthase, uridine kinase, biotin synthase pathway proteins,
methylthioadenosine nucleosidase, a DNA glycosylase and aromatic amino acid
hydroxylase. Thus a complete pathway was identified for biotin biosynthesis. The
additional purine and pyrimidine salvage pathway genes presumably reflect metabolic
limitations in one of the cell types that C. pneumoniae infects or differences in the ability
of C. pneumoniae to transport precursor nucleosides or nucleotides.

The addition of aromatic amino acid hydroxylase in C. pneumoniae is
intriguing, especially in light of the loss of tryptophan biosynthetic genes and the inability
to synthesize other amino acids including phenylalanine. Aromatic amino acid
hydroxylases include three distinct enzymes that function to respectively oxidize
phenylalanine to tyrosine, tyrosine to Dopa, and tryptophan to 5-hydroxytryptophan and
serotonin. Although the chlamydial protein is similar to proteins of this family and
incrementally more closely related to tryptophan hydroxylase, its specific function could
not be confidently predicted. We hypothesize that it may be involved in C. pneumoniae
virulence. Tryptophan hydroxylase has not been previously identified in bacteria and the
origin of the chlamydial gene appears to be from eukaryotes. The functional role of an
aromatic amino acid hydroxylase for C. pneumoniae is linked to the unique intracellular
biology of this organism and may represent a key contribution to C. pneumoniae
persistence and pathogenesis.

It is understood that the examples and embodiments described herein are 
for illustrative purposes only and that various modifications or changes in light thereof 

will be suggested to persons skilled in the art and are to be included within the spirit and
purview of this application and scope of the appended claims. All publications, patents, 
and patent applications cited herein are hereby incorporated by reference in their entirety 
for all purposes. 

Table 1 provides functional assignments of C. pneumoniae nonprotein-encoding
genomic sequences. Table 2 provides functional assignments of protein coding
sequences. Table 3 provides the amino acid sequences of the proteins corresponding to
the coding sequences.

TABLE 1

type     SEQ ID NO:1       SEQ ID NO:1       Gene
         start position    end position

Ori      841664            841396 (R)        Putative Origin of Replica
tmRNA    138493            138074 (R)        tmRNA
pRNA     607342            607649            Ribonuclease P RNA
rRNA     1000564           1002115           16S rRNA
rRNA     1002415           1005278           23S rRNA
rRNA     1005393           1005509           5S rRNA
tRNA     269070            269142            Ala tRNA_1
tRNA     164318            164389            Asn tRNA
tRNA     296224            296151 (R)        Asp tRNA
tRNA     836191            836119 (R)        Ala tRNA_2
tRNA     1030533           1030603           Cys tRNA
tRNA     784896            784822 (R)        Glu tRNA
tRNA     781680            781610 (R)        Gly tRNA_1
tRNA     961536            961607            Gly tRNA_2
tRNA     999949            1000023           His tRNA
tRNA     268992            269065            Ile tRNA
tRNA     672236            672318            Leu tRNA_1
tRNA     680178            680257            Leu tRNA_2
tRNA     715889            715971            Leu tRNA_3
tRNA     739403            739486            Leu tRNA_4
tRNA     1175863           1175944           Leu tRNA_5
tRNA     784994            784922 (R)        Lys tRNA
tRNA     843926            843999            Pro tRNA_2
tRNA     409922            409848 (R)        Pro tRNA_1
tRNA     631373            631445            Phe tRNA
tRNA     677337            677264 (R)        Arg tRNA_2
tRNA     807413            807341 (R)        Arg tRNA_3
tRNA     877473            877400 (R)        Arg tRNA_4
tRNA     462141            462214            Arg tRNA_1
tRNA     1085605           1085676           Gln tRNA
tRNA     786780            786708 (R)        Thr tRNA_3
tRNA     89728             89657 (R)         Thr tRNA_1
tRNA     293477            293405 (R)        Thr tRNA_2
tRNA     87522             87450 (R)         Met tRNA_1
tRNA     199301            199229 (R)        Met tRNA_2
tRNA     199390            199317 (R)        Met tRNA_3
tRNA     626904            626987            Ser tRNA_1
tRNA     708359            708440            Ser tRNA_2
tRNA     1142034           1142117           Ser tRNA_3
tRNA     1230028           1229945 (R)       Ser tRNA_4
tRNA     91070             90999 (R)         Trp tRNA
tRNA     293399            293317 (R)        Tyr tRNA
tRNA     296147            296075 (R)        Val tRNA_1
tRNA     1137389           1137462           Val tRNA_2
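
The start and end positions in Table 1 can be turned into a strand and a feature length with a small helper. This is a sketch under two stated assumptions: the coordinates are 1-based and inclusive, and a start position greater than the end position denotes a reverse-strand feature, which matches the "(R)" marks in the table. The function name `describe_feature` is hypothetical.

```python
def describe_feature(start: int, end: int):
    """Return (strand, length) for a feature given its start and end positions.

    Assumes 1-based inclusive coordinates; a start greater than the end is
    read as a reverse-strand ("R") feature, otherwise forward ("F").
    """
    strand = "R" if start > end else "F"
    length = abs(end - start) + 1
    return strand, length


# A few rows taken from Table 1.
rows = {
    "Ala tRNA_1": (269070, 269142),
    "Asp tRNA": (296224, 296151),
    "16S rRNA": (1000564, 1002115),
}
for gene, (start, end) in rows.items():
    strand, length = describe_feature(start, end)
    print(f"{gene}: strand {strand}, {length} nt")
```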
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Pint t. 



rram 



CPnOOOl 
CPnO002 
CPn0003 
riuioo 1 
CPnUUUS 
CPn0006 
CPn0007 
CPn0008 
CPn0009 
CPnOOlO 
CPnOOlO 
CPnOOll 
CPn0012 
CPnOOlS 
CPn0014 
CPnOOlS 
CPnOOlS 
CPn0017 
CP@0L8 
CPt*Q019 
CPtMo20 
CPT&021 
CPW022 

CP fi 023 

CPnJ)024 

c page 2 5 

CPnlo26 
CP&B027 
CPn0028 
CPH0029 
CP&003O 
CpMo31 
CPfe*032 
CP&§,033 



282 

573 

895 

J no 

4127 

7293 

7605 

10975 

11815 

13435 

14379 

15892 

16644 

18584 

21392 

21335 

24416 

26094 

27522 

29007 

32687 

34410 

34982. 

36603 

37596 

38604 

39625 

42234 

43325 

43755 

43891 

44711 

44923 

46138 



4 

875 
2370 
: i : 
od'JZ 
7141 
10496 
11685 
13119 
14325 
15746 
16614 
18212 
21106 
21922 
24174 
26188 
27170 
29003 
30356 
30603 
32707 
34395 
35014 
36661 
37684 
38762 
39778 
42543 
43390 
44529 
44884 
46098 
48171 



R 
F 

F 
L- 
F 
R 
F 
F 
F 
P 
F 
F 
F 
F 
F 

r 

Y 
Y 
F 
F 
F. 
P 
P 
P 
F 
R 
R 
R 
R 
R 
F 
F 
F 
F 



CPn0034    49457     48210     R
CPn0035    51029     49569     R
CPn0036    51002     51796     F
CPn0037    51792     52115     F
CPn0038    52119     53831     F
CPn0039    54250     53963     R
CPn0040    55643     54318     R
CPn0041    55996     57342     F
CPn0042    57403     58182     F
CPn0043    58447     60372     F
CPn0044    60419     60778     F
CPn0045    61069     62790     F
CPn0046    62790     63263     F
CPn0047    63455     63652     F
CPn0048    63687     65801     F
CPn0049    66296     65817     R
CPn0050    66813     66499     R
CPn0051    66833     67111     F
CPn0052    68005     67304     R
CPn0053    69344     67986     R
CPn0054    70023     69313     R
CPn0055    70129     70590     F
CPn0056    70953     72746     F
CPn0057    72934     73554     F
CPn0058    73639     74562     F
CPn0059    : 4616    75050     F
CPn0060    7*5055    75528     F
CPn0061    75534     76208     F
CPn0062    76308     77690     F
CPn0063    7fUl2     78:67     F
CPn0064    7*346     78576     F
CPn0065    7*9-4     8065 L    F
CPn0066    H09J5     H265S     F


TABLE 2 

1nn n ,.,^r^n ^r.., nfhmmT In nnrtnthtm) 

CT001 hypothetical protein
gatC-Glu-tRNA Gln Amidotransferase (C subunit)-(CT002)
gatA-Glu tRNA Gln Amidotransferase-(CT003)

pmp_1-Polymorphic Outer Membrane Protein G Family



frame-shift with 0010

pmp_2-Polymorphic Outer Membrane Protein G Family

pmp_3-Polymorphic Outer Membrane Protein G Family

pmp_3-PMP_3 (frame-shift with 0014)

pmp_4-Polymorphic Outer Membrane Protein G Family

pmp_4-PMP_4 (frame-shift with 0016)

pmp_5-Polymorphic Outer Membrane Protein G Family

pmp_5-PMP_5 (frame-shift with 0018)

ttc™ *i tle^r (14) peptide: outer ^brane] - (CT3 51)

Predicted OMP [leader (19) peptide]-(CT350)

maf-(CT349)
yjjK/alr-ABC Transporter Protein ATPase-(CT348)
xerC-Integrase/recombinase-(CT347)
eLc/atsA-Sulphohydrolase/Glycosulfatase-(CT346)
CT345 hypothetical protein-(CT345)
lon-Lon ATP-dependent Protease-(CT344)

gcp_1-O-Sialoglycoprotein Endopeptidase_1-(CT343)
rs21-S21 Ribosomal Protein-(CT342)

Fusion-(CT340)

CT339 hypothetical protein
CT338 hypothetical protein
ptsH-PTS Phosphocarrier Protein Hpr-(CT337)
ptsI-PTS PEP Phosphotransferase-(CT336)

ybaB-(CT335)
dnaX_1-DNA Pol III Gamma and Tau_1-(CT334)



yqfF-Bs conserved hypothetical IM protein



hemC-Porphobilinogen Deaminase-(CT299)

sms-Sms Protein-(CT298)

rnc-Ribonuclease III-(CT297)
CT296 hypothetical protein
mrsA-Phosphomannomutase-(CT295)
sodM-Superoxide Dismutase-(CT294)
accD-AcCoA carboxylase/Transferase Beta-(CT293)
dut-dUTP Nucleotidohydrolase-(CT292)
ptsN_1-PTS IIA Protein-(CT291)

ptsK-PTS IIA Protein + HTH DNA-Binding Domain-(CT290)
CT289 hypothetical protein



CT288 hypocher. LmI pcor.-Ln 
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CPn0067 

CPn0068 

CPn0069 

CPn0070 

CPn0071

CPn0072 

CPn0073

CPn0074 

CPn0075 

CPn0076 

CPn0077 

CPn0078 

CPn0079 

CPn0080 

CPn0081

CPn0082 

CPn0083 

CPn0084 

CPn0085 

CPn0086 

CPn0087 

CPn0088 

CPn0089

CPn0090

CPn0091

CPn0092

CPn0093

CPn0094

CPn0095

CPn0096

CPn0097
CPn0098
CPn0099
CPn0100
CPn0101
CPn0102
CPn0103
CPn0104
CPn0105
CPn0106
CPn0107
CPn0108 
CPn0109 
CPn0110
CPn0111
CPn0112 
CPn0113 
CPn0114 
CPn0115 
CPn0116
CPn0117 
CPn0118 
CPn0119 
CPn0120 
CPn0121 
CPn0122 
CPn0123
CPn0124 
CPn0125 
CPn0126 
CPn0127 
CPn0128 
CPn0129 
CPn0130 
CPn0131 
CPn0132 
CPn0133 
CPn0134 
CPn0135 
CPn0136 
CPn0137
CPn0138



82953 
84903 
85236 
87378 
88045 
89061 
89356 
89774 
91102 
91358 
92013 
92465 
93179 
93735 
94261 
98043 
102332 
103362 
104506 
104904 
105579 
106373 
108153 
109454 
110074 
112151 
112509 
113152 
116037 
124314 
124555 
127491 
127593 
129141 
129932 
130123 
131480 
133875 
134847 
135091 
137162 
137857 
138655 
143734 
144686 
144767 
145335 
146398 
147279 
148616 
148989 
150102 
150523 
151164 
151778 
152071 
155969
156614 
158096 
158809 
162143 
162277 
163717 
164245 
164549 
165587 
167334 
169098 
169448 
171401 
172254 
174019 



84053 
84331 
87086 
87208 
87599 
88057 
89574 
90955 
91350 
91903 
92435 
93160 
93688 
94121 
98016 
102221 
103312 
103751 
103766 
105527 
106376 
108145 
109466 
110080 
112053 
112573 
113015 
115971 
118790 
118837 
126006 
126091 
127865 
127882 
129141 
131466 
132511 
132676 
134029 
136374 
136392 
137303 
141783 
141827 
143934 
145093 
146405 
147261 
148622 
148972 
150071 
150464 
151164 
151778 
152068 
153723 
153774
158068 
158605 
161085 
161130 
163053 
163064 
163751 
165580 
166561 
166564 
167467 
169143 
169569 
171502 
172700 



F 

R 

F 

R 

R 

R 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

R 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

R 

F 

R 

F 

R 

R 

F 

F 

R

R 

F 

R 

R 

F 

R 

R 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

F 

R
F 
F 
F 
R 
F 
R 
R 
F 
F 
R 
R 
R 
R 
R 
R 



CT360 hypothetical protein 



CT325 hypothetical protein 
CT324 hypothetical protein 
infA-Initiation Factor IF-1-(CT323)
tufA-Elongation Factor Tu-(CT322)
secE-preprotein translocase-(CT321)
nusG-Transcriptional Antitermination-(CT320)
rl11-L11 Ribosomal Protein-(CT319)
rl1-L1 Ribosomal Protein-(CT318)
rl10-L10 Ribosomal Protein-(CT317)
rl7-L7/L12 Ribosomal Protein-(CT316)
rpoB-RNA Polymerase Beta-(CT315)
rpoC-RNA polymerase Beta'-(CT314)
tal-Transaldolase-(CT313)
predicted ferredoxin-(CT312)
CT311 hypothetical protein
atpE-ATP Synthase Subunit E-(CT310)
CT309 hypothetical protein
atpA-ATP Synthase Subunit A-(CT308)
atpB-ATP Synthase Subunit B-(CT307)
atpD-ATP Synthase Subunit D-(CT306)
atpI-ATP Synthase Subunit I-(CT305)
atpK-ATP Synthase Subunit K-(CT304)
CT303 hypothetical protein
valS-Valyl tRNA Synthetase-(CT302)
pknD-S/T Protein Kinase-(CT301)
uvrA-Excinuclease ABC Subunit A-(CT333)
pyk-Pyruvate Kinase-(CT332)
htrB-Acyltransferase-(CT010)

CT011 hypothetical protein

ybbP family hypothetical protein-(CT012)

cydA-Cytochrome Oxidase Subunit I-(CT013)

cydB-Cytochrome Oxidase Subunit II-(CT014)

CT017 hypothetical protein

CT016 hypothetical protein

phoH-ATPase-(CT015)

CT058 hypothetical protein_1

CT018

ileS-Isoleucyl-tRNA Synthetase-(CT019)

lepB-Signal Peptidase I-(CT020)

CT021 hypothetical protein

rl31-L31 Ribosomal Protein-(CT022)

pfrA-Peptide Chain Releasing Factor (RF-1)-(CT023)

hemK-A/G specific methylase-(CT024)

ffh-Signal Recognition Particle GTPase-(CT025)

rs16-S16 Ribosomal Protein-(CT026)

trmD-tRNA (guanine N-1)-Methyltransferase-(CT027)

rl19-L19 Ribosomal Protein-(CT028)

rnhB_1-Ribonuclease HII_1-(CT029)

gmk-GMP Kinase-(CT030)

CT031 hypothetical protein

metG-Methionyl-tRNA Synthetase-(CT032)
recD_1-Exodeoxyribonuclease V (Alpha Subunit)-(CT033)



ytfF-Cationic Amino Acid Transporter-(CT034)
bpl1-Biotin Protein Ligase-(CT035)
similarity to CT036



CHLPS hypothetical protein-(CT109)
groEL_1-HSP-60_1-(CT110)
groES-10kDa Chaperonin-(CT111)
pepF-Oligopeptidase-(CT112)
ybgI-ACR family-(CT108)

hemL-Glutamate-1-semialdehyde-2,1-aminomutase-(CT210)
CPn0139


174656 


174093 


R 


yqgE- (CT210) 


CPn0140


175110 


174673 


R 


yqdE- (CT212) 


CPn0141


175802 


175110 


R 


rpiA-Ribose-5-P Isomerase A-(CT213)


CPn0142


176091 


175816 


R 




CPn0143


177335 


176214 


R 


*yxjG_Bs_1 Hypothetical Protein


CPn0144


177963


180560 


F 


clpB-Clp Protease ATPase-(CT113)




180777 


182369 


F 


CT114 hypothetical protein 


CPn0146


182613 


183095 


F 






183225 


183671 


F 




CPn0148


183846 


185702 


F 


pkn1-S/T Protein Kinase-(CT145)


CPn0149


185715 


187700 


F 


dnlJ-DNA Ligase- (CT146) 


CPn0150




192444 


F 


CT147 hypothetical protein 


CPn0151




192625 


R 


mhpA-Monooxygenase-(CT148)


CPn0152 


195265 


194318 


R 


CT149 hypothetical protein 


CPn0153


195433 


197892 


F 


leuS-Leucyl tRNA Synthetase-(CT209)


CPn0154 


197892 


199202 


F 


gseA-KDO Transferase-(CT208)


CPn0155 


199691 


199488 


R 




CPn0156 


200117 


199770 


R 




CPn0157 


200723 


200298 


R 




CPn0158 


201430 


200894 


R 




CPn0159 


201772 


201467 


R 




CPn0160


203791 


202127 


R 


pfkA_1-Fructose-6-P Phosphotransferase_1-(CT207)




204622 


203798 


R 


predicted acyltransferase family-(CT206)


CPn0162


\J -J fj +t V 


204803 


R 




CPn0163


206026 


206394 


F 




CPn0164


206498 


206998 


F 




CPn0165


206998 


207582 


F




CPn0166

CPn0167
L fTTw-1 Q 7 


207630 


207962 


F 




208306

A \J O J w u 


207977 


R 




208641 


208417 


R 




209501 


208710 


R 




CPn0170


211026 


210025 


R 




CPn0171 


212435 


211149 


R 


*guaA-GMP Synthase 


CPn0172 


213177 


212440 


R 


*guaB/impD-Inosine 5'-monophosphate dehydrogenase (COOH-only)


CPn0173


213987 


213715 






CPn0174


214257


214724 


F 




CPn0175


214898 


215275 


F






215286 


216518 


F 


CT153 hypothetical protein 


CPn0177


217459 


216608 


R 




CPn0178


218052 


217789 


R 




CPn0179 


218403 


218056 


R 




CPn0180


218851 


218355 


R 






219175


218777 


R 




CPn0182 


220695 


219334 


R 


accC-Biotin Carboxylase- (CT124) 


CPn0183 


221195 


220695 


R 


accB-Biotin Carboxyl Carrier Protein-(CT123)


CPn0184 


221775 


221221 


R 


efp_1-Elongation Factor P_1-(CT122)


CPn0185 


222451 


221765 


R 


rpe/araD-Ribulose-P Epimerase-(CT121)


CPn0186 


222899 


224068 


F 


*similarity to Cps IncA_1-(CT119)


CPn0187 


224248 


225045 


F 


predicted methylase- (CT133) 


CPn0188 


225111 


226400 


F 


CT132 hypothetical protein 


CPn0189 


226400 


229825 


F


CT131 homolog-(Possible Transmembrane Protein)


CPn0190 


229919 


231274 


F 




CPn0191 


231991 


231314 


R 


glnQ-ABC Amino Acid Transporter ATPase- (CT130) 


CPn0192 


232634 


231984 


R 


glnP-ABC Amino Acid Transporter Permease-(CT129)


CPn0193 


233126 


232686 


R 


*argR-Arginine Repressor 


CPn0194


233210


234241


F 


gcp_2-O-Sialoglycoprotein Endopeptidase_2-(CT197)


CPn0195 


234190 


235785 


F 


oppA_1-Oligopeptide Binding Protein_1


CPn0196 


235939 


237519 


F 


oppA_2-Oligopeptide Binding Protein_2-(CT198)


CPn0197 


237578 


238882 


F 


oppA_3-Oligopeptide Binding Protein_3


CPn0198 


239169 


240746 


F 


oppA_4-Oligopeptide Binding Protein_4


CPn0199 


241042 


241983 


F 


oppB_1-Oligopeptide Permease_1-(CT199)


CPn0200 


242017 


242868 


F 


oppC_1-Oligopeptide Permease_1-(CT200)


CPn0201 


242864 


243715 


F 


oppD-Oligopeptide Transport ATPase- (CT201) 


CPn0202 


243715 


244500 


F 


oppF-Oligopeptide Transport ATPase-(CT202)


CPn0203


245008 


245802 


F 




CPn0204 


245817 


246002 


F 




CPn0205 


246133 


246327 


F 




CPn0206 


246409 


247161 


F 


CT203 hypothetical protein 


CPn0207 


247208 


248617 


F 


ybhI/sodiT1-Oxoglutarate/Malate Translocator-(CT204)


CPn0208


248953 


250602 


F 


pfkA_2-Fructose-6-P Phosphotransferase_2-(CT205)


CPn0209 


251036 


251272 


F
CPn0210


O C T 1 Q A 

_ 3_ J 04 


251440 


R 




CPn0211


_ 3_ / do 


_ 3 _ 4 b J 


R 




CPn0212 


_ 3 4 Uo b 


_ 3_ O 0 O 


R 




CPn0213


Z 34 J 4_£ 


£ 34 1 y U 


R 




CPn0214 


255o57 


_£3444b 


R 




CPn0215 


257015 


_ 33 / sy 


R 




CPn0216 


257608 


257174


R 




CPn0217 


257896


2 30 379 


F 


ypdP-(CT140)


CPn0218 


259058 


258582


R 




CPn0219 


259357 


260472 


F 


tgt-Queuine tRNA Ribosyl Transferase-(CT193)


CPn0220 


260696


261238 


F 




CPn0221 


O CI C C7 

_ b 1 0 3 / 


O fi? ft£_ 


F 




CPn0222 


262504 


262842 


F 


*weak similarity to Bacteriophage CHP1 (Orf4)


CPn0223


262956 


26 3 J J J 


F 




CPn0224 


263435 


263674 


F 




CPn0225 


263873 


264541 


F 




CPn0226 


2o43ob 


_ b4 y b / 


F 




CPn0227


2o54_ o 


"i C C ft A Q 

£ 03U jy 


R 


dsbB-Disulfide bond Oxidoreductase-(CT176)


CPn0228 


t ft 

_ bbll U 


i 0 34 1^ 


R 


dsbG-Disulfide Bond Chaperone-(CT177)


CPn0229 


266328 


267560 


F 


CT178 hypothetical protein 


CPn0230


268253 


267576


R 


CT179 hypothetical protein 


CPn0231
CPn0232


268957 


268253 


R 


tauB-ABC Transport ATPase (Nitrate/Fe)-(CT180)


270122 


269232 


R 


*similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine Nucleosidase


CPn0233


270424 


270248 


R 




CPn0234


271240 


270548 


R 


CT181 hypothetical protein 


CPn0235


271416 


272177 


F 


kdsB-deoxyoctulonosic Acid Synthetase- (CT182) 


CPn0236

CPn0237




4 'J /ob 


F 


pyrG-CTP Synthetase-(CT183)


273762 


274214 


F 


yggF Family-(CT184)


CPn0238


274303


275838 


F 


zwf-Glucose-6-P Dehydrogenase-(CT185)


CPn0239

a 


* / 3 0 3 y 


c l bb / z 


F 


devB-Glucose-6-P Dehydrogenase (DevB family)-(CT186)




_ / /obi 


276698 


R 




CPn0241


279354 


278203 


R 




CPn0242


_ /yyio 


279487 


R 






_ OU333 


£ oU 1 J J 


R 




CPn0244




-Zol33b 


F 


adk-Adenylate Kinase-(CT128)


CPn0245


_ O 1 b 4 3 


282499 


F 


ydhO-Polysaccharide Hydrolase-Invasin Repeat Family-(CT127)


CPn0246


"7 Q 0 Q ^ *5 


ZoZ 331 


R 


rs9-S9 Ribosomal Protein-(CT126)






«o£ y by 


R 


rl13-L13 Ribosomal Protein-(CT125)


CPn0248




TOT f Cfl 

Z 0 J b 3 u 


R 


ycfV/ybbA-ABC Transporter ATPase-(CT152)


CPn0249


3 O 4 1 


284333


R 


CT151 hypothetical protein 


CPn0250


_ O O U 3 ' 


z o 3 y uz 


R 


rl33-L33 Ribosomal Protein-(CT150)


CPn0251


286060 


287559 


F 


*conserved hypothetical protein


CPn0252


288112 


287576


R 


CT144 hypothetical protein (frame-shift with 0253?) 


CPn0253


288456


287950 


R 


CT144 hypothetical protein_1




"J B Q 0 £ "5 


xOtJ4 3 y 


R 


CT143 hypothetical protein_1


CPn0255


_ y U 1 0 3 


ztjy jzy 


R 


CT142 hypothetical protein_1




291264 


290398 


R 


CT144 hypothetical protein_2 


CPn0257 


292127 


291267 


R 


CT143 hypothetical protein_2 


CPn0258


292534


292133 


R 


CT142 hypothetical protein (frame-shift with 0259?)


CPn0259


_ y_ y ob 


292441 


R 


CT142 hypothetical protein_2 


CPn0260




o o. t c x a 
Zy J348 


R 


secA_1-Protein Translocase Subunit_1-(CT141)


CPn0261


* y * j 


295033 


F 


ydaO-PP-Loop Superfamily ATPase-(CT217)


CPn0262 


295091 




F 


surE-SurE-like Acid Phosphatase-(CT218)






z y / 1 jo 


F 


yqfU hypothetical protein-(CT221)


CPn0264 


297730


*) Q*7 1 EC 
«7 / 1 3 3 


R 


ubiD-Phenylacrylate Decarboxylase- (CT220) 


CPn0265 


298620 


T 07*7 i n 


R


ubiA-Benzoate Octaphenyltransferase-(CT219)


CPn0266


299184 


* y y o / o 


F 




CPn0267 


300122 


j uu y i u 


F 








"1 ft 1 11 0 


F 




CPn0269 


302450 


J U14 / O 


R


Dipeptidase-(CT138)


CPn0270 


inline 


*1 ft"? d Q 
J U Z 4 0 0 


R 


ywlc-SuA5 Superfamily-related Protein-(CT137)


CPn0271 


303634 


304362 


F


Lysophospholipase Esterase-(CT136)


CPn0272 


305233 


304340 


R 


dnaX_2-DNA Pol III Gamma and Tau_2-(CT187)


CPn0273 


305844 


305227 


R 


tdk-Thymidylate Kinase- (CT188) 


CPn0274 


308353 


305852 


R 


gyrA_1-DNA Gyrase Subunit A_1-(CT189)


CPn0275


310786 


308372 


R 


gyrB_1-DNA Gyrase Subunit B_1-(CT190)


CPn0276 


311137 


310793 


R 


CT191 hypothetical protein 


CPn0277 


311910 


311404 


R 




CPn0278 


312875 


312060 


R 


*conserved outer membrane lipoprotein protein


CPn0279 


313537 


312875 


R 


*Possible ABC Transporter Permease Protein


CPn0280 


314572 


313550 


R 


dppF-Dipeptide Transporter ATPase-(CT689)






CPn0281 


315057 


316103 


F 


CPn0282 


316126 


317529 


F 


CPn0283 


318497 


317532 


R 


CPn0284 


319045 


318551 


R 


CPn0285


320595 


319051 


R 


CPn0286 


322059 


320650 


R 


CPn0287 


324221 


322089 


R 


CPn0288 


325716 


324571 


R 


CPn0289 


325812 


326996 


F 


CPn0290 


327042 


328523 


F


CPn0291 


328667 


329194 


F 


CPn0292 


329228 


329836 


F 


CPn0293 


329949 


332723 


F 


CPn0294 


333092 


333502 


F 


CPn0295 


333863 


333627 


R 


CPn0296 


334765 


334022 


R 


CPn0297 


335697 


334774 


R


CPn0298 


336721 


335717 




CPn0299 


336816 


337415 


F


CPn0300 


337783 


340152 


F


CPn0301 


340250 


340762 


F


CPn0302


340787 


341866 


F


CPn0303


342958 


341921 


R


CPn0304


343133 


344158 


F


CPn0305


344154 


345137 


F


CPn0306


345145 


346431 


F


CPn0307


348986 


346515 


R


CPn0308


349234 


349596 


F 


CPn0309


350974 


349595 


R 


CPn0310


353433


351049 


R


CPn0311


354438 


353575 


R 


CPn0312


354524 


354976 


F


CPn0313


354990 


355355 


F 


CPn0314


356285 


355353 


R


CPn0315


356977 


358716 


F


CPn0316


358820 


360121 


F 


CPn0317


360081 


362750 




CPn0318 


362767 


363126 


F 


CPn0319


363175 


363879 


F 


CPn0320 


363860 


364783 


F 


CPn0321 


365858 


364767 


R 


CPn0322 


366249 


367328 


F 


CPn0323 


367331


369460 


F 


CPn0324 


369492 


370688 


F 


CPn0325 


370708 


371148 


F 


CPn0326 


371148 


372725 


F 


CPn0327 


372945 


373211 


F 


CPn0328 


373241 


374992 


F 


CPn0329 


375088 


376146 


F 


CPn0330 


376675 


376202 


R 


CPn0331 


378437 


376701 


R


CPn0332 


378655 


378536 


R 


CPn0333 


379090 


378800 


R 


CPn0334 


379311


379823 


F 


CPn0335


379817 


380674 


F 


CPn0336


380650


381591 


F


CPn0337 


382027 


381575 


R 


CPn0338 


382278 


383375 


F 


CPn0339 


383420 


384034 


F 


CPn0340 


383842 


384156


F 


CPn0341 


384160 


384495 


F 


CPn0342 


384622 


385062 


F 


CPn0343 


384999


385595 


F 


CPn0344 


387420 


385558 


R 


CPn0345 


388572 


387436 


R 


CPn0346 


389675 


388704 


R 


CPn0347 


391021 


389678 


R 


CPn0348 


391803 


391027 


R 


CPn0349 


392770 


391790 


R 


CPn0350 


393181 


393684 


F 


CPn0351 


393888 


395432 


F 



dhnA-Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin family)-(CT215)

xasA/gadC-Amino Acid Transporter-(CT216)



mgtE-Mg++ Transporter (CBS Domain)-(CT194)
CT195 hypothetical protein

aaaT-Neutral Amino Acid (Glutamate) Transporter-(CT230)
Na-dependent Transporter-(CT231)
incB-Inclusion Membrane Protein B-(CT232)
incC-Inclusion Membrane Protein C-(CT233)
CT234 hypothetical protein

cAMP-Dependent Protein Kinase Regulatory Subunit-(CT235)

acpP-Acyl Carrier Protein-(CT236)

fabG-Oxoacyl (Carrier Protein) Reductase-(CT237)

fabD-Malonyl Acyl Carrier Transacylase-(CT238)

fabH-Oxoacyl Carrier Protein Synthase III-(CT239)

recR-Recombination Protein-(CT240)

yaeT-Omp85 Analog-(CT241)

(OmpH-Like Outer Membrane Protein)-(CT242)
lpxD-UDP Glucosamine N-Acyltransferase-(CT243)
CT244 hypothetical protein

pdhA/odpA-Pyruvate Dehydrogenase Alpha-(CT245)
pdhB/odpB-Pyruvate Dehydrogenase Beta-(CT246)
pdhC-Dihydrolipoamide Acetyltransferase-(CT247)
glgP-Glycogen Phosphorylase-(CT248)
similarity to CT249

dnaA_1-Replication Initiation Protein_1-(CT250)

60IM-60kDa Inner Membrane Protein-(CT251)

lgt-Prolipoprotein Diacylglycerol Transferase-(CT252)

CT101 hypothetical protein

acpS-Acyl-carrier Protein Synthase-(CT100)

trxB-Thioredoxin Reductase-(CT099)

rs1-S1 Ribosomal Protein-(CT098)

nusA-N Utilization Protein A-(CT097)

infB-Initiation Factor-2-(CT096)

rbfA-Ribosome Binding Factor A-(CT095)

truB-tRNA Pseudouridine Synthase-(CT094)

ribF-FAD Synthase-(CT093)

ychF-GTP Binding Protein-(CT092)

yscU-YopS Translocation Protein U-(CT091)

lcrD-Low Calcium Response D-(CT090)

lcrE-Low Calcium Response E-(CT089)

sycE-Secretion Chaperone-(CT088)

malQ-Glucanotransferase-(CT087)

rl28-L28 Ribosomal Protein-(CT086)

CT085 hypothetical protein

Phospholipase D Superfamily [leader (33) peptide]-(CT084)

CT083 hypothetical protein

CT082 hypothetical protein

CHLTR T2 Protein-(CT081)

ltuB-(CT080)

CT079 similarity

folD-Methylene Tetrahydrofolate Dehydrogenase-(CT078)

yojL-(CT077)

smpB-Small Protein B-(CT076)
dnaN-DNA Pol III (beta chain)-(CT075)
recF-ABC superfamily ATPase-(CT074)
(frame-shift with 0339)
(frame-shift with 0340)

predicted OMP [leader (19) peptide]-(CT073)
(frame-shift with 0342?)
yaeL-Metalloprotease-(CT072)
yaeM-(CT071)

troD/ytgD-Integral Membrane Protein-(CT070)
troC/ytgC-Integral Membrane Protein-(CT069)
troB/ytgB-ABC transporter ATPase-(CT068)
troA/ytgA-Solute Protein Binding Family-(CT067)
CT066 hypothetical protein
adt_1-ADP/ATP Translocase_1-(CT065)






CPn0352 


395574 


396830 


F 


CPn0353 


396893 


397135 


F 


CPn0354 


397167 


398507 


F 


CPn0355 


399889 


398591 


R 


CPn0356 


400459 


400109 


R 


CPn0357


401317 


400469 


R 


CPn0358 


401751 


401578 


R 


CPn0359 


402012 


403817 


F 


CPn0360 


405358 


403922 


R 


CPn0361 


406647 


405382 


R 


CPn0362


407825 


407055 


R 


CPn0363 


409688 


407943 


R 


CPn0364 


409966 


410238 


F 


CPn0365 


410528 


411544 


F 


CPn0366 


411976 


412440 


F 


CPn0367 


413102 


413836 


F 


CPn0368 


413790 


414107 


F 


CPn0369 


414351 


415562 


F 


CPn0370 


415800 


416912 


F 


CPn0371 


417147 


417503 


F 


CPn0372


417687 


418001 


F 


CPn0373


418380 


420218 


F 


CPn0374


420218 


420961 


F 


CPn0375


421121 


421615 


F 


CPn0376


421854 


422294 


F 


CPn0377


423438 


422347 


R 


CPn0378


426168 


423445 


R 


CPn0379


426322 


426765 


F 


CPn0380


426758 


427876 


F 


CPn0381


429809 


428037 


R 


CPn0382


430749 


430036 


R 


CPn0383


431693 


430749 


R 


CPn0384


432377 


431862 


R 


CPn0385


434018 


432522 


R 


CPn0386


434525 


434046 


R 


CPn0387


435196 


434699 


R 


CPn0388


435329 


437320 


F 


CPn0389


438134 


437319 


R 


CPn0390


439144 


438134 


R 


CPn0391 


439692 


439510 


R 


CPn0392 


439814 


440383 


F


CPn0393 


440379 


440723 


F 


CPn0394 


440736 


441968


F 


CPn0395


441964 


443175 


F 


CPn0396 


444353 


443241 


R 


CPn0397 


445115 


444381 


R 


CPn0398 


445533 


445700 


F 


CPn0399 


445879 


446523 


F 


CPn0400 


446536 


447306 


F 


CPn0401 


447884 


447495 


R 


CPn0402 


448994 


447688 


R 


CPn0403 


449015 


449710 


F


CPn0404 


450887 


449871 


R 


CPn0405 


451739 


450966 


R 


CPn0406 


451969 


452865 


F 


CPn0407 


453742 


452858 


R 


CPn0408 


454105 


454581


F 


CPn0409 


454645 


455127 


F 


CPn0410 


455123 


455833 


F 


CPn0411


455833 


456609 


F 


CPn0412 


456590 


457246 


F 


CPn0413 


459203 


457227 


R 


CPn0414 


460143 


459172 


R 


CPn0415 


461498 


460221 


R 


CPn0416


461856 


461557 


R 


CPn0417 


463035 


462244 


R 


CPn0418


464401 


462953 


R 


CPn0419 


466834 


464876 


R 


CPn0420 


467108 


466824 


R 


CPn0421 


467998 


467108 


R 


CPn0422 


468242


468784 


F 


CPn0423 


468791 


469216 


F 



lepA-GTPase-(CT064)

gnd-6-Phosphogluconate Dehydrogenase-(CT063)
tyrS-tyrosyl tRNA Synthetase-(CT062)
fliA/rpsD-Sigma-28/WhiG Family-(CT061)
flhA-Flagellar Secretion Protein-(CT060)
fer4-Ferredoxin IV-(CT059)



CT058 hypothetical protein_2
CT058 hypothetical protein_3



gcpE-(CT057)

CT056 hypothetical protein



sucB_1-Dihydrolipoamide Succinyltransferase_1-(CT055)
sucA-Oxoglutarate Dehydrogenase-(CT054)
CT053 hypothetical protein

hemN_1-Coproporphyrinogen III Oxidase_1-(CT052)
CT326 similarity

yabC/yraL-SAM-Dependent Methyltransferase-(CT048)
CT047 hypothetical protein
hctB-Histone-like Protein 2-(CT046)
pepA-Leucyl Aminopeptidase A-(CT045)
ssb-SS DNA Binding Protein-(CT044)
CT043 hypothetical protein

glgX-Glycogen Hydrolase (debranching)-(CT042)
CT041 hypothetical protein
ruvB-Holliday Junction Helicase-(CT040)

dcd-dCTP Deaminase-(CT039)
CT038 hypothetical protein

tlyC_1-CBS Domain protein (Hemolysin Homolog)_1-(CT256)
CT257 hypothetical protein
yhfO-NifS-related protein-(CT258)
PP2C phosphatase family-(CT259)

CT253 hypothetical protein
CT254 hypothetical protein
CT255 hypothetical protein
mutY-Adenine Glycosylase-(CT107)

yceC-predicted pseudouridine synthetase family-(CT106)
CT105 hypothetical protein

fabI-Enoyl-Acyl-Carrier Protein Reductase-(CT104)
HAD superfamily hydrolase/phosphatase-(CT103)

CT102 hypothetical protein

CT260 hypothetical protein

dnaQ_1-DNA Pol III Epsilon Chain_1-(CT261)

CT262 hypothetical protein

CT263 hypothetical protein

msbA-Transport ATP Binding Protein-(CT264)

accA-AcCoA Carboxylase/Transferase Alpha-(CT265)

CT266 hypothetical protein

himD/ihfA-Integration Host Factor Alpha-(CT267)

amiA-N-Acetylmuramoyl Alanine Amidase-(CT268)

murE-N-Acetylmuramoylalanylglutamyl DAP Ligase-(CT269)

pbp3-transglycolase/transpeptidase-(CT270)

CT271 hypothetical protein

yabC-PBP2B Family methyltransferase-(CT272)

CT273 hypothetical protein

CT274 hypothetical protein





CPn0424 


469612 


CPn0425 


470980 


CPn0426 


472111 


CPn0427 


472207 


CPn0428 


473722 


CPn0429 


474681 


CPn0430 


475326 


CPn0431 


476483 


CPn0432 


476816 


CPn0433 


477273 


CPn0434 


479462 


CPn0435 


480902 


CPn0436 


481618 


CPn0437


481816 


CPn0438 


485416 


CPn0439 


485553 


CPn0440 


486105 


CPn0441 


486891 


CPn0442 


488013 


CPn0443 


488729 


CPn0444 


490287 


CPn0445


494772 


CPn0446 


497626 


CPn0447


500568 


CPn0448


504810 


CPn0449


507231 


CPn0450


508112 


CPn0451


508275 


CPn0452 


511319 


CPn0453


513234 


CPn0454


516182 


CPn0455 


520348 


CPn0456 


521532 


CPn0457 


523865 


CPn0458


526320 


CPn0459


527005 


CPn0460


527840 


CPn0461 


528638 


CPn0462


531052 


CPn0463


532357 


CPn0464 


532842 


CPn0465 


533212 


CPn0466 


533724 


CPn0467 


536633 


CPn0468 


539632 


CPn0469 


540399 


CPn0470 


541357 


CPn0471 


542564 


CPn0472 


547905 


CPn0473 


549593 


CPn0474 


551573 


CPn0475 


553844 


CPn0476 


554844 


CPn0477 


556106 


CPn0478 


557625 


CPn0479 


558425 


CPn0480


559303


CPn0481 


560946 


CPn0482 


561737 


CPn0483 


561836 


CPn0484 


564970 


CPn0485 


566038 


CPn0486 


567784 


CPn0487 


569740 


CPn0488 


570096 


CPn0489 


570965 


CPn0490 


571279 


CPn0491 


574352 


CPn0492


574652 


CPn0493 


575004 


CPn0494 


575364 


CPn0495


575603 



470961 
471564 
471536 
473715 
474681 
475319 
476093 
476151 
476514 
476929 
477276 
479475 
480902 
484350
484334 
486077 
486740 
487838 
488528 
489979 
494507 
497579 
500415 
503351 
503698 
505330 
507180 
511058 
512860 
516152 
519115 
519458 
520327 
522120 
524236 
526619 
526992 
527844 
529037 
531191 
532366 
532871 
536537 
539434 
540432 
541460 
542532 
545401 
545581 
548070 
549807 
551685 
553858 
554844 
556210 
557616 
558650
559339 
560961 
564964 
565824 
566229 
566405 
568112 
569767 
570096 
573333 
573336 
574804 
574855 
575146 
576793 



F 
F 
R 
F 
F 
F 
F 
R 
R 
R 
R 
R 
R 
F 
R 
F 
F 
F 
F 
F 
F 
F 
F 
F 
R 
R 
R 
F 
F 
F 
F 
R 
R 
R 
R 
R 
R
R
R 
R 
R 
R 
F 
F 
F 
F 
F 
F 
R 
R 
R 
R 
R 
R 
R 
R 
R
R 
R 
F 
F 
F 
R 
R 
R 
R 
F 
R 
F 
R 
R 
F 



dnaA_2-Replication Initiation Factor_2-(CT275)
CT276 hypothetical protein
CT277 similarity

nqr2-NADH (Ubiquinone) Dehydrogenase-(CT278)
nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma-(CT279)
nqr4-NADH (Ubiquinone) Reductase 4-(CT280)
nqr5-NADH (Ubiquinone) Reductase 5-(CT281)



gcsH-Glycine Cleavage System H Protein-(CT282)

CT283 hypothetical protein

Phospholipase D superfamily [uncleavable leader peptide]-(CT284)

lplA-Lipoate Protein Ligase-Like Protein-(CT285)

clpC-ClpC Protease-(CT286)

ycbF-PP-loop superfamily ATPase-(CT287)



CT007 hypothetical protein
CT006 hypothetical protein
CT005 hypothetical protein

pmp_6-Polymorphic Outer Membrane Protein G/I Family
pmp_7-Polymorphic Outer Membrane Protein G Family
pmp_8-Polymorphic Outer Membrane Protein G Family
pmp_9-Polymorphic Outer Membrane Protein G/I Family
*yxjG_Bs_2 Hypothetical Protein
pmp_10-PMP_10 (Frame-shift with 0451)
pmp_10-Polymorphic Outer Membrane Protein G Family
pmp_11-Polymorphic Outer Membrane Protein G Family
pmp_12-Polymorphic Outer Membrane Protein A/I Family
pmp_13-Polymorphic Outer Membrane Protein G Family
pmp_14-Polymorphic Outer Membrane Protein H Family-(truncated)



pmp_15-Polymorphic Outer Membrane Protein E Family

pmp_16-Polymorphic Outer Membrane Protein E Family

pmp_17-Polymorphic Outer Membrane Protein E Family

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0469)

pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0470)

pmp_18-Polymorphic Outer Membrane Protein E/F Family



CT365 hypothetical protein
glgB-Glucan Branching Enzyme-(CT866)
CT865 hypothetical protein
*yqeV_Bs Hypothetical Protein
hflX-GTP Binding Protein-(CT379)
phnP-Metal Dependent Hydrolase-(CT380)
CT383 hypothetical protein



artJ-Arginine Periplasmic Binding Protein-(CT381)

aroG-Deoxyheptonate Aldolase-(CT382)
CT382.1 hypothetical protein
*hypothetical proline permease
CT384 hypothetical protein
hitA-HIT Family Hydrolase-(CT385)
CT386 hypothetical protein
CT387 hypothetical protein
CT389 hypothetical protein



aspC-Aspartate Aminotransferase-(CT390)
CPn0496


576793 


577812 


F 


CPn0497 


578089 


577820 


R 


CPn0498 


579035 


578085 


R 


CPn0499 


580359 


579205 


R 


CPn0500


580659 


582362 


F 


CPn0501


582457 


583650 


F 


CPn0502 


583650 


584201 


F 


CPn0503 


584234 


586213 


F 


CPn0504 


586487 


588514 


F 


CPn0505 


588519 


589106 


F 


CPn0506 


589172 


589840 


F 


CPn0507 


589961 


590122 


F 


CPn0508 


590142 


590300 


F 


CPn0509 


590335 


590808 


F 


CPn0510


590813 


591973 


F 


CPn0511


592141 


592488 


F 


CPn0512 


592553 


594412 


F 


CPn0513 


594647 


595753 


F 


CPn0514 


595729 


596520 


F 


CPn0515 


596492 


597181 


F 


CPn0516 


598814 


597255 


R 


CPn0517 


599631 


598795 


R 


CPn0518


600803 


599832 


R 


CPn0519


601674 


600904


R 


CPn0520
CPn0521
CPn0522


602218 


601646 


R 


603797 


602241 


R 


603987 


604655 


F 


CPn0523


604723 


605052 


F 


CPn0524


605103 


606179 


F 


CPn0525


606522 


607283 


F


CPn0526


608696 


607710 


R 


CPn0527


609904 


608726 


R 


CPn0528


611162 


609921 


R 


CPn0529


612259 


611165 


R 


CPn0530 


613254 


612460 


R 


CPn0531


614069 


613245 


R 


CPn0532


614674 


614075 


R 


CPn0533


614930 


615385 


F


CPn0534


615413 


615784 


F 


CPn0535


615793 


616296 


F 


CPn0536


616345 


617691 


F 


CPn0537 


617833 


618189 


F 


CPn0538 


618212 


618511 


F 


CPn0539 


618705 


621545 


F 


CPn0540 


621694 


626862 


F 


CPn0541 


627170 


628003 


F 


CPn0542 


628003 


C *> O 711 
OZS7 J 7 


F 


CPn0543 


628725 


629603 


F 


CPn0544 


630529 


629525 


R 


CPn0545 


630884 


630633 


R 


CPn0546 


631229 


630912 


R 


CPn0547 


631661 


632188 


F


CPn0548 


633231 


632191 


R 


CPn0549 


633569 


633255


R 


CPn0550 


635661 


633580 


R 


CPn0551


636168 


635698 


R 








R


CPn0552 


636587


636219


CPn0553 


637747 


636812 


R 


CPn0554 


637854 


638141 


F 


CPn0555 


638298 


640241 


F 


CPn0556 


640912 


640325 


R 


CPn0557 


642861 


641194 


R 


CPn0558 


643300 


643031 


R 


CPn0559


643742 


643927 


F 


CPn0560 


645612 


644098 


R 


CPn0561 


646404 


645871 


R 


CPn0562 


648036 


646918 


R 


CPn0563 


650056 


648293 


R 


CPn0564 


654350 


650145 


R 


CPn0565 


655630 


654533 


R 


CPn0566 


656141 


656890 


F 


CPn0567 


656894 


657817 


F 



CT391 hypothetical protein 
CT388 hypothetical protein 



proS-Prolyl tRNA Synthetase-(CT393)

hrcA-HTH Transcriptional Repressor-(CT394)

grpE-HSP-70 Cofactor-(CT395)

dnaK-HSP-70-(CT396)

vacB-ribonuclease family-(CT397)

*3-methyladenine DNA glycosylase

CT421 hypothetical protein 

CT421.1 hypothetical protein 

CT421.2 hypothetical protein 

(predicted Metalloenzyme) - (CT422) 

tlyC_2-CBS Domains (Hemolysin homolog)_2-(CT423)

rsbV_1-Sigma Regulatory Factor_1-(CT424)

CT425 hypothetical protein

Fe-S oxidoreductase_1-(CT426)

CT427 hypothetical protein

ubiE-Ubiquinone Methyltransferase-(CT428)



CT429 hypothetical protein
dapF-Diaminopimelate Epimerase-(CT430)
clpP-CLP Protease-(CT431)
glyA-Serine Hydroxymethyltransferase-(CT432)
CT433 hypothetical protein 

CT398 hypothetical protein 

yrbH-GutQ/KpsF Family Sugar-P Isomerase-(CT399)

sucB_2-Dihydrolipoamide Succinyltransferase_2-(CT400)

gltT-Glutamate Symport-(CT401)

ycaH-ATPase-(CT402)

spoU_1-rRNA Methylase_1-(CT403)

SAM dependent methyltransferase-(CT404)

ribC/risA-Riboflavin Synthase-(CT405)

CT406 hypothetical protein

dksA-DnaK Suppressor-(CT407)

lspA-Lipoprotein Signal Peptidase-(CT408)

dagA_1-D-Ala/Gly Permease_1-(CT409)

CT814.1 hypothetical protein 

CT814 hypothetical protein 

pmp_19-polymorphic outer membrane protein A Family-(CT412)
pmp_20-polymorphic outer membrane protein B Family-(CT413)
Solute binding protein (yebL-Synechocystis Adhesin Homolog)-(CT415)
ABC Transporter ATPase-(CT416)
(Metal Transport Protein)-(CT417)
yhbZ-GTP binding protein-(CT418)
rl27-L27 ribosomal protein-(CT419)
rl21-L21 Ribosomal Protein-(CT420)
ygbB family-(CT434)
cysJ-Sulfite Reductase-(CT435)
rs10-S10 Ribosomal Protein-(CT436)
fusA-Elongation Factor G-(CT437)
rs7-S7 Ribosomal Protein-(CT438)
rs12-S12 Ribosomal Protein



CT440 hypothetical protein 
tsp-Tail-Specific Protease-(CT441)
crpA-15kDa Cysteine-Rich Protein-(CT442)

omcB-60kDa Cysteine-Rich Outer Membrane Complex Protein-(CT443)
omcA-9kDa-Cysteine-Rich Outer Membrane Complex Lipoprotein-(CT444)
CT441.1 hypothetical protein
gltX-Glutamyl-tRNA Synthetase-(CT445)
euo-CHLPS Euo Protein-(CT446)
*CHLPS 43 kDa protein homolog_1

recJ-ssDNA Exonuclease-(CT447)
secD&secF-Protein Export Proteins SecD/SecF (fusion)-(CT448)
CT449 hypothetical protein
yaeS family-(CT450)

cdsA-Phosphatidate Cytidylyltransferase-(CT451)
CPn0568


657817 


658464 


F 


CPn0569 


658464


659099 


F 


CPn0570


659107 


660789 


F 


CPn0571 


662122 


660749 


R 


CPn0572 


662352 


664616 


F 


CPn0573 


665404 


664691 


R 


CPn0574 


665945 


665394 


R 


CPn0575 


666494 


665982 


R 


CPn0576


667543 


666494 


R 


CPn0576 


667598 


667530 


R 


CPn0577 


667895 


668155 


F 


CPn0578 


668406 


669365 


F 


CPn0579 


669361 


669993 


F 


CPn0580 


669993 


670793 


F 


CPn0581 


671434 


670745 


R 


CPn0582 


671503 


672177 


F 


CPn0583 


672400 


672717 


F 


CPn0584 


672707 


673798 


F


CPn0585 


675817 


673865 


R


CPn0586 


676026 


677183 


F


CPn0587 


677441 


678124 


F 


CPn0588 


678084 


678626 


F 


CPn0589


678640 


679395 


F 


CPn0590
CPn0591


680112 


679516 


F 


680373 


681020 


F 


CPn0592


681153 


681461 


F 


CPn0593


682476 


681391 


R


CPn0594


682583 


684956 


F


CPn0595


684958 


685926 


F 


CPn0596
CPn0597


685939 


686457 


F 


688215 


686479 


R 


CPn0598


689697 


688219 


R 


CPn0599


691802 


689682 


R 


CPn0600


692147 


691827 


R 


CPn0601


693053 


692736 


R 


CPn0602


694105 


693104 


R 


CPn0603


694205 


695185 


F 


CPn0604


695945 


695196 


R* 


CPn0605 


696707 


696150 


R 


CPn0606


697444 


696707 


R 


CPn0607


698895 


697573 


R 


CPn0608 


699645


699016 


R 


CPn0609 


699705 


699986 


F 


CPn0610


701420 


700029 


R 


CPn0611 


702025 


701420 


R 


CPn0612 


704631 


702022 


R 


CPn0613 


705656 


704658 


R 


CPn0614 


707402 


705783 


R 


CPn0615 


708137 


707634 


R 


CPn0616 


708791 


710137 


F 


CPn0617 


710484 


712316 


F 


CPn0618 


712306 


713010 


F


CPn0619 


713444 


713013 


R 


CPn0620 


714139 


713519 


R 


CPn0621 


714647 


714144 


R 


CPn0622 


715752 


714793 


R 


CPn0623


716993


r -7*6-1=63- ^ 




CPn0624 


718015 


717011 


R 


CPn0625 


718485 


718060 


R 


CPn0626 


719616 


718495 


R 


CPn0627 


720038 


719640 


R 


CPn0628 


720428 


720063 


R 


CPn0629 


721857 


720487 


R 


CPn0630


722316


721885 


R 


CPn0631 


722806 


722312 


R 


CPn0632 


723195 


722827 


R 


CPn0633 


723757 


723209 


R 


CPn0634 


724185 


723787 


R 


CPn0635


724745 


724206 


R 


CPn0636 


725082 


724750 


R 


CPn0637 


725464 


725099 


R 


CPn0638


725747 


725490 


R 



cdsA-Phosphatidate Cytidylyltransferase-(CT452)
plsC-Glycerol-3-P Acyltransferase-(CT453)
argS-Arginyl tRNA Transferase-(CT454)
murA-UDP-N-Acetylglucosamine Transferase-(CT455)
CT456 hypothetical protein
yebC family-(CT457)

YhhY-Amino Group Acetyl Transferase-(CT458)
prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift)-(CT459)
prfB-(natural UGA frame-shift)
SWIB (YM74) complex protein-(CT460)
yaeI-phosphohydrolase-(CT461)

ygbP/yacM-Sugar Nucleotide Phosphorylase-(CT462)

truA-Pseudouridylate Synthase I-(CT463)

Phosphoglycolate Phosphatase-(CT464)

CT465 hypothetical protein 

CT466 hypothetical protein 

atoS/ntrB-2-Component Sensor- (CT467) 

*similarity to Cps IncA_2

atoC/ntrC-2-Component Regulator-(CT468)

*yvyD_Bs conserved hypothetical protein 

CT469 hypothetical protein 

CT470 hypothetical protein 

CT471 hypothetical protein 

yagE family-(CT472)

yidD family- (CT473) 

CT474 hypothetical protein 

pheT-phenylalanyl tRNA Synthetase Beta-(CT475)

CT476 hypothetical protein 

ada-methyltransferase-(CT477)

oppC_2-Oligopeptide Permease_2-(CT478)

oppB_2-Oligopeptide Permease_2-(CT479)

oppA_5-oligopeptide Binding Lipoprotein_5-(CT480)

CT483 hypothetical protein 
CT484 hypothetical protein 
hemZ-Ferrochelatase-(CT485)
fliY-Glutamine Binding Protein-(CT486)
yhhF-Methylase-(CT487)
CT488 hypothetical protein
glgC-Glucose-1-P Adenyltransferase-(CT489)

*pyrF-Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated?
CT490 hypothetical protein 

rho-Transcription Termination Factor- (CT491) 
yacE-predicted phosphatase/kinase- (CT492) 
polA-DNA Polymerase I-(CT493) 
sohB-Protease-(CT494)
adt_2-ADP/ATP Translocase_2-(CT495)

pgsA_1-Glycerol-3-P Phosphatidyltransferase_1-(CT496)
dnaB-Replicative DNA Helicase- (CT497) 
gidA-FAD-dependent oxidoreductase- (CT498) 
lplA-Lipoate-Protein Ligase A-(CT499) 
ndx-Nucleoside-2-P Kinase- (CT500) 
ruvA-Holliday Junction Helicase- (CT501) 
ruvC-Crossover Junction Endonuclease- (CT502) 
CT503 hypothetical protein 
CT504 hypothetical protein



gapA-Glyceraldehyde-3-P Dehydrogenase-(CT505)
rl17-L17 Ribosomal Protein-(CT506)
rpoA-RNA Polymerase Alpha-(CT507)
rs11-S11 Ribosomal Protein-(CT508)
rs13-S13 Ribosomal Protein-(CT509)
secY-Translocase-(CT510)
rl15-L15 Ribosomal Protein-(CT511)
rs5-S5 Ribosomal Protein-(CT512)
rl18-L18 Ribosomal Protein-(CT513)
rl6-L6 Ribosomal Protein-(CT514)
rs8-S8 Ribosomal Protein-(CT515)
rl5-L5 Ribosomal Protein-(CT516)
rl24-L24 Ribosomal Protein-(CT517)
rl14-L14 Ribosomal Protein-(CT518)
rs17-S17 Ribosomal Protein-(CT519)






CPn0639 
CPn0640 
CPn0641 
CPn0642 
CPn0643 
CPn0644 
CPn0645 
CPn0646 
CPn0647 
CPn0648 
CPn0649 
CPn0650 
CPn0651 
CPn0652 
CPn0653 
CPn0654 
CPn0655 
CPn0656 
CPn0657 
CPn0658 
CPn0659 
CPn0660
CPn0661
CPn0662
CPn0663
CPn0664
CPn0665
CPn0666
CPn0667

CPn0668
CPn0669

CPn0670 

CPn0671

CPn0672

CPn0673

CPn0674

CPn0675

CPn0676

CPn0677

CPn0678

CPn0679 

CPn0680 

CPn0681 

CPn0682 

CPn0683

CPn0684 

CPn0685 

CPn0686 

CPn0687 

CPn0688 

CPn0689 

CPn0690 

CPn0691 

CPn0692 

CPn0693 

CPn0694 

CPn0695

CPn0696 

CPn0697 

CPn0698 

CPn0699 

CPn0700 

CPn0701

CPn0702 

CPn0703 

CPn0704 

CPn0705

CPn0706 

CPn0707 

CPn0708 

CPn0709 

CPn0710 



725958 
726377 
727077 
727428 
727713 
728573 
728930 
729621 
730331 
731603 
732672 
733501 
733975 
734835 
736490 
736967 
737847 
737872 
738473 
739168 
739533 
740327 
741100 
742923 
744190 
744757 
745001 
746388 
751058 
751209 
752179 
752765 
753630 
753741 
755287 
756668 
757919 
759217 
760401 
761320 
762930 
764248 
764929 
764984 
765948 
768038 
768068 
768361 
768564 
769382 
771404 
772680 
773452 
774912 
776256 
779599 
"^78021^ 
781769 
782602 
783458 
784182 
785097 
785599 
789685 
791190 
792321 
793173 
793683 
795029 
795705 
796188 
796461 



725743 
725964 
726409 
727096 
727450 
727722 
728598 
728950 
729657 
730605 
731710 
732665 
733517 
733990 
734868 
736503 
737101 
738048 
738051 
738455 
739838 
739860 
740327 
741172 
742901 
744557 
746365 
750107 
750177 
752162 
752775 
753196 
753205 
755048 
755463 
755577 
756768 
758051 
759256 
760682 
761725 
762971 
764258 
765955 
766919 
767181 
768217 
768176 
769214 
770137 
770187 
771436 
772685 
773461 
775240 
776330 
781382
782599 
783447 
784201 
784721 
785609 
786672 
786929 
789685 
791209 
792334 
793180 
793704 
795034 
795742 
796210 



rl29-L29 Ribosomal Protein-(CT520)
rl16-L16 Ribosomal Protein-(CT521)
rs3-S3 Ribosomal Protein-(CT522)
rl22-L22 Ribosomal Protein-(CT523)
rs19-S19 Ribosomal Protein-(CT524)
rl2-L2 Ribosomal Protein-(CT525)
rl23-L23 Ribosomal Protein-(CT526)
rl4-L4 Ribosomal Protein-(CT527)
rl3-L3 Ribosomal Protein-(CT528)
CT529 hypothetical protein 

fmt-Methionyl tRNA Formyltransferase-(CT530)
lpxA-Acyl-Carrier UDP-GlcNAc-(CT531)
fabZ-Myristoyl-Acyl Carrier Dehydratase-(CT532)
lpxC-Myristoyl GlcNac Deacetylase-(CT533)
cutE-Apolipoprotein N-Acetyltransferase-(CT534)
vdlD/yciA-acyl-CoA Thioesterase-(CT535)
dnaQ_2-DNA Pol III Epsilon Chain_2-(CT536)

yjeE (ATPase or Kinase)-(CT537)
CT538 hypothetical protein
trxA-Thioredoxin-(CT539)

spoU_2-rRNA Methylase_2-(CT540)
mip-FKBP-type peptidyl-prolyl cis-trans isomerase-(CT541)
aspS-Aspartyl tRNA Synthetase-(CT542)
hisS-Histidyl tRNA Synthetase-(CT543)

uhpC-Hexosephosphate Transport-(CT544)
dnaE-DNA Pol III Alpha-(CT545)
predicted OMP [leader (17) peptide]-(CT546)
CT547 hypothetical protein 
CT548 hypothetical protein 

rsbW-sigma regulatory factor-histidine kinase-(CT549)
CT550 hypothetical protein

dacF (pbp5)-D-Ala-D-Ala Carboxypeptidase-(CT551)
CT552 hypothetical protein
fmu-RNA Methyltransferase-(CT553)
CT696 hypothetical protein
homologous to CT695



pgk-Phosphoglycerate Kinase-(CT693)
ygo4-Phosphate Permease-(CT692)
CT691 hypothetical protein

dppD-ABC ATPase Dipeptide Transport-(CT690)
dppF-ABC ATPase Dipeptide Transport-(CT689)
spoJ/parB-Chromosome Partitioning Protein-(CT688)



F CT482 hypothetical protein

F CT481 hypothetical protein

R yfhO_1-NifS-related Aminotransferase_1-(CT687)

R ABC Transporter Membrane Protein-(CT686)

R abcX-ABC Transporter ATPase-(CT685)

R ABC Transporter-(CT684)

R TPR Repeats (O-Linked GlcNAc Transferase similarity)-(CT683)

R pbp2-PBP2-transglycolase/transpeptidase-(CT682)
F ompA-Major Outer Membrane Protein-(CT681)

F rs2-S2 Ribosomal Protein-(CT680)

F tsf-Elongation Factor TS-(CT679)

F pyrH-UMP Kinase-(CT678)

F rrf-Ribosome Releasing Factor-(CT677)

F CT676 hypothetical protein
karG-Arginine Kinase-(CT675)



R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

F 

R 

R 

F 

R 

R 

R 

R 

R 

F 

F 

R 

F 

F 

F 

R 

F 

F 

R 

R 

R* 

R 

R 

R 

R 



yscC/gspD-Yop C/Gen Secretion Protein D-(CT674)
pkn5-S/T Protein Kinase-(CT673)

fliN-Flagellar Motor Switch Domain/YscQ family-(CT672)

CT671 hypothetical protein 

CT670 hypothetical protein 

yscN-Yop N (Flagellar-Type ATPase) - (CT669 ) 

CT668 hypothetical protein 

CT667 hypothetical protein 

CT666 hypothetical protein
CPn0711


796731 


796486 


R 


CT665 hypothetical protein 


CPn0712


799315 


796781 


R 


FHA domain; homology to adenylate cyclase-(CT664)


CPn0713 


799721 


799332 


R 


CT663 hypothetical protein 


CPn0714 


801107 


800091


R 


hemA-Glutamyl tRNA Reductase-(CT662)


CPn0715


801657 


803462 


F 


gyrB_2-DNA Gyrase Subunit B_2-(CT661) 


CPn0716 


803469 


804902 


F 


gyrA_2-DNA Gyrase Subunit A_2-(CT660)


CPn0717


805010 


805306 


F 


CT656 hypothetical protein 


CPn0718 


805309 


805626 


F 


CT657 hypothetical protein 


CPn0719 


805916 


806890 


F 


sfhB-(Pseudouridine Synthase)-(CT658)


CPn0720 


807003


807236 


F 


CT659 hypothetical protein 


CPn0721 


807683 


808489 


F


kdsA-KDO Synthetase-(CT655)


CPn0722 


808489 


808974 


F 


CT654 hypothetical protein 


CPn0723 


808984


809703 


F 


yhbG-ABC Transporter ATPase- (CT653 ) 


CPn0724


810527 


809706 


R 


CT652.1 hypothetical protein 


CPn0725 


810811 


810587 


R 


CPn0726 


813372 


810880 


R 


CT620 hypothetical protein 


CPn0727 


813577 


816192 


F 


CT619 hypothetical protein 


CPn0728 


818477 


816525 


R 


CHLPN 76kDa Homolog_1 (CT622)


CPn0729 


819857 


818592 


R 


CHLPN 76kDa Homolog_2 (CT623)


CPn0730 


821603 


819963 


R 


mviN-Integral Membrane Protein-(CT624)


CPn0731 


821587 


821760 


F 




CPn0732 


822098 


822976 


F 


nfo-Endonuclease IV-(CT625) 


CPn0733


823727 


823101 


R 


rs4-S4 Ribosomal Protein-(CT626)


CPn0734
CPn0735


823944 
825668 


824915 
825003 


F 
R 


yceA-(CT627) 

*pyrH/udk-Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine Ribonucleoside Kinase)


CPn0736


827686 


825992 


R 


ygeD-Efflux Protein- (CT641) 


CPn0737
CPn0738


827685 


830756 


F 


recC-Exodeoxyribonuclease V, Gamma- (CT640) 


830746 


833895 


F 


recB-Exodeoxyribonuclease V, Beta-(CT639)


CPn0739
CPn0740
CPn0741


834871 


833861 


R 


CT638 hypothetical protein 


836048 


834864 


R 


tyrB-Aromatic AA Aminotransferase-(CT637)


838350 


836185 


R 


greA-Transcription Elongation Factor- (CT636) 


CPn0742 


838463 


838888 


F 


CT635 hypothetical protein 


CPn0743 


838962 


840362 


F 


nqrA-Ubiquinone Oxidoreductase, Alpha-(CT634)


CPn0744


841384 


840389 


R 


hemB-Porphobilinogen Synthase-(CT633)


CPn0745


841903 


841742 


R 




CPn0746


841975 


843567 


F 


CT632 hypothetical protein 


CPn0747


843675 


843740 


F


CT631 hypothetical protein 


CPn0747
CPn0748


843725 


843910 


F 


CT631 hypothetical protein (frame-shift) 


844987 


844121 


R 


ispA-Geranyltranstransferase-(CT628)


CPn0749


845629 


845006 


R 


glmU-UDP-GlcNAc Pyrophosphorylase- (CT629) 


CPn0750 


846411 


845707 


R 


tctD/cpxR-HTH Transcriptional Regulatory Protein + Receiver-(CT630)


CPn0751 


846608 


848434 


F 


CT651 hypothetical protein 


CPn0752 


848604 


850082 


F 


recD_2-Exodeoxyribonuclease V, Alpha_2-(CT652)


CPn0753 


851006 


850161 


R 




CPn0754 


851336 


851040 


R 


rs20-S20 Ribosomal Protein-(CT617)


CPn0755 


851597 


852799 


F 


CT616 hypothetical protein 


CPn0756 


852961 


854676 


F 


rpoD-RNA Polymerase Sigma-66 -(CT615) 


CPn0757 


854733 


855134 


F 


folX-Dihydroneopterin Aldolase-(CT614)


CPn0758 


855110 


856459 


F 


folP/dhpS-Dihydropteroate Synthase- (CT613) 


CPn0759 


856488 


856997 


F


folA-Dihydrofolate Reductase-(CT612)


CPn0760 


856957 


857694 


F 


CT611 hypothetical protein 


CPn0761 


857704 


858375 


F 


CT610 hypothetical protein 


CPn0762 


859597 


858539 


R 


recA-RecA recombination protein- (CT650) 


CPn0763 


860511 


859972 


R 


ygfA-Formyltetrahydrofolate Cycloligase-(CT649)


CPn0764


861807


860524


R


CT648 hypothetical protein


CPn0765 


862382 


861801 


R 


CT647 hypothetical protein 


CPn0766 


863782 


862394 


R 


CT646 hypothetical protein


CPn0767 


863884 


864177 


F 


CT645 hypothetical protein 


CPn0768


864159 


865163 


F 


yohI/nir3 -predicted oxidoreductase -(CT644) 


CPn0769 


867733 


865121 


R 


topA-DNA Topoisomerase I-Fused to SWI Domain-(CT643)


CPn0770


868340 


869131 


F 


CT642 hypothetical protein


CPn0771 


870463 


869144 


R 


rpoN-RNA Polymerase Sigma-54-(CT609)


CPn0772 


872385 


870469 


R 


uvrD-DNA Helicase- (CT608 ) 


CPn0773 


872488 


873195 


F 


ung-Uracil DNA Glycosylase- (CT607 ) 


CPn0774 


873195 


873425 


F 


CT606.1 hypothetical protein 


CPn0775 


874031 


873414 


R 


yggV family- (CT606 ) 


CPn0776 


874246 


875487 


F 


CT605 hypothetical protein 


CPn0777 


875601 


877178 


F 


groEL_2-heat shock protein-60 -(CT604) 


CPn0778 


877505 


878092 


F 


tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase-(CT603)


CPn0779 


878481


878095 


R 


CT602 hypothetical protein
CPn0780


879205 


878591 


R 


CPn0781


879773 


879198 


R 


CPn0782 


881065 


879773 


R 


CPn0783 


881885 


881100 


R 


CPn0784 


0 0 ^ <c ™ o 


881892 


R 


CPn0785 


882991 


882296 


R 


CPn0786 


883185 


885293 


F 


CPn0787 


885619 


886401


F 


CPn0788 


886542 


887432 


F 


CPn0789 


887439 


889316 


F 


CPn0790 


889330 


890103 


F 


CPn0791 


893050 


890111 


R 


CPn0792 


894919 


893108 


R 


CPn0793 


896823 


894919 


R 


CPn0794 


897174 


898004 


F 


CPn0795 


898128 


899195


F 


CPn0796 


899301 


901340 


F 


CPn0797 


901600 


902694 


F 


CPn0798 


902846 


903856 


F 


CPn0799 


904986 


903940 


R 


CPn0800


906532 


905249 


R 


CPn0801


908697 


906727 


R 


CPn0802


909740 


908709 


R 


CPn0803


910303 


909752 


R 


CPn0804


911059 


910310 


R 


CPn0805


911831


911067


R


CPn0806

913771 


911867 


R 


CPn0807


913971 


914879 


F 


CPn0808


916287 


914956 


R 


CPn0809


917785 


916307 


R 


CPn0810


918184


917825


R


CPn0811

918900 


918208 


R 


CPn0812


919123 


920862 


F 


CPn0813


920870 


921934 


F 


CPn0814


922107 


923357 


F 


CPn0815


923361 


925622 


F 


CPn0816


925615 


927102 


F* 


CPn0817


927115 


928287 


F 


CPn0818


928314 


928682 


F 


CPn0819 


928689 


929132 


F


CPn0820 


929120 


929659 


F 


CPn0821 


929667 


930668 


F 


CPn0822 


930756 


931229 


F 


CPn0823 


932367 


931501 


R 


CPn0824 


932662 


932378 


R 


CPn0825 


933594 


932677 


R 


CPn0826 


934310 


933612 


R 


CPn0827 


935264 


934434 


R 


CPn0828 


936271 


935267 


R 


CPn0829 


936744 


937298 


F 


CPn0830 


937444 


937959 


F


CPn0831 


938267 


938434 


F 


CPn0832 


939747 


938827 


R 


CPn0833 


941129 


939747 


R 


CPn0834 


941553 


942014 


F 


CPn0835


945689


942045


R


CPn0836 


946879 


945722 


R 


CPn0837 


947771 


947145 


R 


CPn0838 


949106 


947781 


R 


CPn0839 


949257 


950159 


F 


CPn0840 


950222 


951544 


F 


CPn0841


951731 


954640 


F 


CPn0842 


954883 


954710 


R 


CPn0843


955191 


954994 


R 


CPn0844 


956730 


955270 


R 


CPn0845 


958079 


956850 


R 


CPn0846


959374 


958112 


R 


CPn0847 


959995 


959387 


R 


CPn0848 


961502 


960177 


R 


CPn0849 


961788 


965285 


F 


CPn0850


965293 


966390 


F 



papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase- (CT601) 
pal-Peptidoglycan-Associated Lipoprotein- (CT600) 
tolB-polysaccharide transporter- (CT599) 
CT598 hypothetical protein 
exbD-Biopolymer Transport Protein-(CT597)
exbB/tolQ-polysaccharide transporter-(CT596)
dsbD/xprA-Thio:disulfide Interchange Protein-(CT595)
yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase-(CT594)
sdhC-Succinate Dehydrogenase-(CT593)
sdhA-Succinate Dehydrogenase- (CT592) 
sdhB-Succinate Dehydrogenase- (CT591) 
CT590 hypothetical protein 
CT589 hypothetical protein 

rbsU-sigma regulatory family protein-PP2C phosphatase (RsbW 
antagonist) -(CT588) 



eno-Enolase- (CT587) 

uvrB-Exinuclease ABC Subunit B-(CT586) 
trpS-Tryptophanyl tRNA Synthetase-(CT585)
CT584 hypothetical protein

gp6D-CHLTR Plasmid Paralog-(CT583)

minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D-(CT582)

thrS-Threonyl tRNA Synthetase- (CT581) 

CT580 hypothetical protein 

CT579 hypothetical protein 

CT578 hypothetical protein 

CT577 hypothetical protein 

lcrH_l-Low Ca Response Protein H_1-(CT576) 

mutL-DNA Mismatch Repair-(CT575)

pepP-Aminopeptidase P-(CT574) 

CT573 hypothetical protein 

gspD/pilQ-Gen. Secretion Protein D-(CT572) 

gspE-Gen. Secretion Protein E-(CT571) 

gspF-Gen. Secretion Protein F-(CT570) 

predicted OMP [leader (16) peptide]-(CT569)

CT568 hypothetical protein

CT567 hypothetical protein 

CT566 hypothetical protein 

CT565 hypothetical protein 

yscT/spaR-YopT Translocation T-(CT564)

yscS/fliQ-YopS/fliQ Translocation Protein- (CT563 ) 

yscR-Yop Translocation R-(CT562) 

yscL-Yop Translocation L-(CT561) 

CT560 hypothetical protein 

yscJ-Yop Translocation J-(CT559) 



lipA-Lipoate Synthetase-(CT558)
lpdA-Lipoamide Dehydrogenase-(CT557)
CT556 hypothetical protein
mot1_1-SWI/SNF family helicase_1-(CT555)



brnQ-Amino Acid (Branched) Transport- (CT554) 
nth-Endonuclease III-(CT697)

thdF-Thiophene/Furan Oxidation Protein-(CT698)
psdD-Phosphatidylserine Decarboxylase- (CT699 ) 
CT700 hypothetical protein 
secA_2-Translocase SecA_2 - (CT701) 

CT702 hypothetical protein (frame-shift with 0643) 
CT702 hypothetical protein 
yphC-GTPase/GTP-binding protein- (CT703) 
pcnB_1-Poly A Polymerase_1-(CT704)
clpX-CLP Protease ATPase-(CT705)
clpP-CLP Protease Subunit-(CT706)

tig/murl-Trigger Factor-peptidyl-prolyl isomerase-(CT707)
mot1_2-SWI/SNF family helicase_2-(CT708)
mreB-Rod Shape Protein-Sugar Kinase-(CT709)
F
F




993372 


994022 




rsfan n fl 7 


QQA "\AA 






CPn0877


995533 


995982 




CPn0878


996654 


995992 




CPn0879


997439 


996645 


ft 




QQQQC1 


Q Q. 7 A A A 
7 7 / 4 4 1 


R 


rl Wi 0 fl R 1 




i oofi^oo. 


r« 
f 


/TiEM* n fl A 7 


i nnfi^fiR 


1 ("1074 04 


r 


Lr nu oo j 


iUUOOO J 


1007573 


D 

l\ 


f^Pn 0 fl ft 4 


1009359 


i ooqooq 


D 
i\ 


f*PTl 0 fl ft ^ 


1U1UOJ3 


1 OOQd 7 
iuU7t J j 


O 

rt 


(TPTSnRflfi 
i u ooo 


101127 6 


i ni oqoa 

1U1U7UO 


D 

n. 






i m 4 1 ^7 


F 


HPn fl R R fl 
mui uooo 


1015423 




rt 


r^Pn 0 R fl 0 

™ W O O 7 




10154 62 


s 
r\ 


o r q o 


1 01 7 R05 
CPn0895
CPn0896
CPn0897


1028737
CPn0898


1030460 


1028904 


R 


CPn0899 


1030875 


1032215 


F 


CPn0900 


1032235 


1033281 


F 


CPn0901 


1033287 


1034537 


F 


CPn0902 


1034543 


1035241


F 


CPn0903 


1035263 


1036417 


F 


CPn0904 


1036326 


1037396 


F 


CPn0905 


1037409 


1039835


F 


CPn0906 


1040340 


1039915 




CPn0907


1040780


1040445


R


CPn0908 


1041589 


1040780 




CPn0909


1041637 


1041966 


F 


CPn0910 


1041979 


1043004 


F 


CPn0911 


1044043


1042985 




CPn0912 


1044129 


1045760 


F 


CPn0913 


1045760 


1045945 


F 


CPn0914 


1045999 


1046397 


F 


CPn0915 


1046461 


1046817 


F 


CPn0916 


1046837 


1048084 


F 


CPn0917 


1048090 


1048539 


F 


CPn0918 


1049223 


1048579 


R 


CPn0919 


1049378 


1050430 


F 


CPn0920 


1051405 


1050431 


R 


CPn0921 


1051535


1052293 


F 



pckA-Phosphoenolpyruvate Carboxykinase-(CT710)

CT711 hypothetical protein 

CT712 hypothetical protein 

ompB-Outer Membrane Protein B-(CT713) 

gpdA-Glycerol-3-P Dehydrogenase-(CT714)

AgX-1 Homolog-UDP-Glucose Pyrophosphorylase-(CT715)

CT716 hypothetical protein 

fliI-Flagellum-specific ATP Synthase-(CT717)
CT718 hypothetical protein
fliF-Flagellar M-Ring Protein-(CT719)
nifU-NifU-related protein-(CT720)
yfhO_2-NifS-related protein_2-(CT721)
pgmA-Phosphoglycerate Mutase-(CT722)
yjbC-predicted pseudouridine synthase-(CT723)
CT724 hypothetical protein
birA-Biotin Synthetase-(CT725)
rodA-Rod Shape Protein-(CT726)

zntA/cadA-Metal Transport P-type ATPase- (CT727) 
CT728 hypothetical protein 
serS-Seryl tRNA Synthetase_2- (CT729 ) 
ribD-Riboflavin Deaminase-(CT730)

ribA&ribB-GTP Cyclohydratase & DHBP Synthase-(CT731)

ribE-Ribityllumazine Synthase-(CT732)

CT733 hypothetical protein 

CT734 hypothetical protein 

dagA_2-D-Alanine/Glycine Permease_2-(CT735)

ybcL family-(CT736)

SET Domain protein-(CT737)

yycJ-metal dependent hydrolase-(CT738)

ftsK-Cell Division Protein FtsK-(CT739) 



dmpP/nqr6-Phenolhydrolase/NADH ubiquinone oxidoreductase- (CT740) 

CT741 hypothetical protein 

ygcA-rRNA Methyltransferase-(CT742)

hctA-Histone-Like Developmental Protein-(CT743)

CHLTR possible phosphoprotein-(CT744)

hemG-protoporphyrinogen Oxidase- (CT745) 

hemN_2-Coproporphyrinogen III Oxidase_2-(CT746)

hemE-Uroporphyrinogen Decarboxylase-(CT747)

mfd-Transcription-Repair Coupling-(CT748)

alaS-Alanyl tRNA Synthetase-(CT749)

tktB-Transketolase-(CT750)

amn-AMP Nucleosidase-(CT751)

efp_2-Elongation Factor P_2-(CT752)

CT753 hypothetical protein 

(possible phosphohydrolase) - (CT754 ) 

Mitochondrial HSP60 Chaperonin Homolog- (CT755) 

murF-Muramoyl-DAP Ligase-(CT756)

mraY-Muramoyl-Pentapeptide Transferase-(CT757)

murD-Muramoylalanine-Glutamate Ligase-(CT758)

nlpD-Muramidase (invasin repeat family)-(CT759)

ftsW-Cell Division Protein FtsW-(CT760)

murG-Peptidoglycan Transferase-(CT761)

murC&ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase-(CT762)
CT763 hypothetical protein 

*cutA-periplasmic Divalent Cation Tolerance Protein CutA (C-Type Cytochrome Biogenesis Protein)
CT764 hypothetical protein
rsbV_2-Sigma Factor Regulator_2-(CT765)
miaA-tRNA Pyrophosphate Transferase-(CT766)
Fe-S cluster oxidoreductase_2-(CT767)
CT768 hypothetical protein



ybeB-iojap superfamily ortholog-(CT769)
fabF-Acyl Carrier Protein Synthase-(CT770)
hydrolase/phosphatase homolog-(CT771)
ppa-Inorganic Pyrophosphatase-(CT772)
ldh-Leucine Dehydrogenase-(CT773)

cysQ-Sulfite Synthesis/biphosphate phosphatase-(CT774)
snGlycerol-3-P Acyltransferase-(CT775)
CPn0922


1052314 


1053927 


F 


aas-Acylglycerophosphoethanolamine Acyltransferase-(CT776)


CPn0923


1053984


1055093 


F 


bioF_l -Oxononanoate Synthase_l- (CT777) 


CPn0924 


1057274 


1055028 


R 


priA-Primosomal Protein N' -(CT778) 


CPn0925 


1057900 


1057226 


R 


CT779 hypothetical protein 


CPn0926 


1058060 


1058557 


F 


Thioredoxin Disulfide Isomerase- (CT780 ) 


CPn0927 


1059809 


1058670 


R 


*CHLPS 43 kDa protein homolog_2 


CPn0928 


1061008 


1059884 


R 


*CHLPS 43 kDa protein homolog_3


CPn0929 


1062292 


1061186 


R 


*CHLPS 43 kDa protein homolog_4


CPn0930 


1062857 


1063330 


F 




CPn0931 


1064138 


1065718 


F 


lysS-Lysyl tRNA Synthetase- (CT781 ) 


CPn0932 


1067142 


1065721 


R 


cysS-Cysteinyl tRNA Synthetase- (CT782 ) 


CPn0933 


1067535 


1068578 


F 


predicted disulfide bond isomerase-(CT783)


CPn0934 


1068942 


1068526 


R 


rnpA-Ribonuclease P Protein Component-(CT784)


CPn0935 


1069091 


1068957 


R 


rl34-L34 Ribosomal Protein-(CT785)


CPn0936


1069336 


1069470 


F 


rl36-L36 Ribosomal Protein- (CT786) 


CPn0937 


1069496


1069798 


F 


rs14-S14 Ribosomal Protein-(CT787)


CPn0938 


1070322 


1069849 


R 


CT788 hypothetical protein -[leader (60) peptide-periplasmic] 


CPn0939 


1070728 


1071195 


F 


CT790 hypothetical protein 


CPn0940 


1073012 


1071204 


R 


uvrC-Excinuclease ABC, Subunit C-(CT791)


CPn0941 


1075501 


1073018


R 


mutS-DNA Mismatch Repair- (CT792 ) 


CPn0942 


1075985 


1077754 


F 


dnaG/priM-DNA Primase- (CT794 ) 


CPn0943 


1077978 


1078238 


F 


CT794.1 hypothetical protein 


CPn0944


1078512 


1078997 


F 




CPn0945


1079070 


1079660 


F 


CT795 hypothetical protein 


CPn0946 
CPn0947


1082786 


1079745 


R 


glyQ-Glycyl tRNA Synthetase- (CT796 ) 


1083442 


1084059 


F 


pgsA_2-Glycerol-3-P Phosphatidyltransferase_2-(CT797)


CPn0948


1085474 


1084047 


R 


glgA-Glycogen Synthase-(CT798)


CPn0949


1085929 


1086483 


F 


ctc-General Stress Protein- (CT799) 


CPn0950


1086488 


1087027 


F 


pth-Peptidyl tRNA Hydrolase- (CT800) 


CPn0951
CPn0952
CPn0953


1087122 


1087457 


F 


rs6-S6 Ribosomal Protein- (CT801) 


1087478 


1087723 


F 


rsl8-S18 Ribosomal Protein- (CT802 ) 


1087742 


1088248 


F 


rl9-L9 Ribosomal Protein-(CT803)


CPn0954 


1088286


1088708 


F 


ychB-Predicted Kinase- (CT804) 


CPn0955


1088612 


1089175 


F 


(frame-shift with 0954) 


CPn0956 
CPn0957


1089560 


1090909 


F 


CT805 hypothetical protein 


1093788


1090963 


R 


ide/ptr-Insulinase family/ Protease III-(CT806) 


CPn0958


1094785 


1093793 


R 


plsB-Glycerol-3-P Acyltransferase-(CT807)


CPn0959


1096343 


1094799 


R 


cafE-Axial Filament Protein- (CT808 ) 


CPn0960


1096764 


1097102 


F 


CT809 hypothetical protein


CPn0961


1097118 


1097297 


F 


rl32-L32 Ribosomal Protein- (CT810 ) 


CPn0962


1097316 


1098275 


F 


plsX-FA/Phospholipid Synthesis Protein- (CT811) 


CPn0963 


1098398


1103224 


F 


pmp_21-Polymorphic Outer Membrane Protein D Family-(CT812)


CPn0964 


1104758 


1103301 


R 




CPn0965 


1106736 


1104925 


R 


lpxB-Lipid A Disaccharide Synthase-(CT411)


CPn0966


1108037 


1106748 


R 


pcnB_2-PolyA Polymerase_2- (CT410 ) 


CPn0967 


1108512 


1109885 


F 


mrsA/pgm-Phosphoglucomutase-(CT815)


CPn0968 


1109895 


1111721 


F 


glmS-Glucosamine-Fructose-6-P Aminotransferase- (CT816) 


CPn0969 


1111812 


1112999 


F 


tyrP_1-Tyrosine Transport_1-(CT817)


CPn0970


1113461 


1114648 


F 


tyrP_2-Tyrosine Transport_2-(CT818)


CPn0971


1114702 


1115415


F


yccA-Transport Permease-(CT819)


CPn0972


1116299 


1115430 


R 


ftsY-Cell Division Protein FtsY-(CT820) 


CPn0973


1116370 


1117527 


F 


sucC-Succinyl-CoA Synthetase, Beta-(CT821) 


CPn0974


1117544 


1118422 


F 


sucD-Succinyl-CoA Synthetase, Alpha- (CT822)


CPn0975


1119104 


1119637 


F 






1120082


1121185


F 




CPn0977 


i it 1 J / 1 


1122402 


F 




CPn0978




1123693 


F 






111 J Sou 


1125443 


F 


htrA-DO Serine Protease- (CT823 ) 


CPn0980


1 1 £ 0 7 O £ 


1125504 


R 


•similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein




1127031


1129952 


F 


Zinc Metalloprotease (insulinase family)- (CT824)




1131194 


1129962 


R 


yigN family- (CT825)


CPn0983 


1132000 


1131206 


R 


pssA-Glycerol-Serine Phosphatidyltransferase- (CT826)


CPn0984 


1132379 


1135510 


F 


nrdA-Ribonucleoside Reductase, Large Chain- (CT827)


CPn0985 


1135534 


1136571 


F 


nrdB-Ribonucleoside Reductase, Small Chain- (CT828) 


CPn0986 


1136724 


1137395 


F 


yggH-predicted rRNA Methylase- (CT829)


CPn0987 


1137516 


1138115 


F 


ytgB-like predicted rRNA methylase- (CT830) 


CPn0988 


1138986 


1138075 


R 


murB-UDP-N-Acetylenolpyruvoylglucosamine Reductase- (CT831)


CPn0989


1139495 


1139016 


R 


CT832 hypothetical protein 


CPn0990


1139883 


1140440 


F 


infC-Initiation Factor 3-(CT833) 


CPn0991 


1140421 


1140612 


F 


rl35-L35 Ribosomal Protein- (CT834)
CPn0992


1140634 


1140996


F 


CPn0993


1141014 


1142030


F 


CPn0994 


1142398 


I 1 A A A A f\ 

I I 4 4 4 4 U 


F


CPn0995 


1145512 



1144413 




CPn0996 


1146589 


1 1 A C Q 1 Q 




CPn0997 


1146708 


1147004 


F


CPn0998 


1147855 


1150584 


F 


CPn0999 


1152847 


1150766 


R 


CPn1000


1153157 


1152891 


R 


CPn1001


1153405 


1153869 


F 


CPn1002


1153862 


1154089


F 


CPn1003


1154796 


1154092 


R 


CPn1004


1155397 


1154879 


R 


CPn1005


1155933 


1155415 


R 


CPn1006


1156472 



1155990 


R 


CPn1007


1156689 


1156907 


F 


CPn1008


1156928 


1158223 


F 


CPn1009


1159058 


1158186 


R 


CPn1010


1159672 


1159067 


R 


CPn1011


1160306 



1159902 


R 


CPn1012


1162193 


1160421 


R 


CPn1013
CPn1014


1162245 


110 J Oi!4 


F


1165426 


1163732 


R 


CPn1015


1165634 


1166693 


F 


CPn1016


1167042 



1168898 


F 


CPn1017
CPn1018


1169006 


1 T CQQ1C 


F


1169898 


1170629 


F 


CPn1019


1172128 


1170638 


R 


CPn1020


1173679 


1172150 


R 


CPn1021


1174213 


1173698 


R 


CPn1022


1175673 


1174216 


R 


CPn1023


1176035 


1176331 


F 


CPn1024


1177236 


1176334 


R 


CPn1025


1177302 


1178879 


F 


CPn1026


1178997 


1179137 


F 


CPn1027


1179175 


1180755 


F 



CPn1028


1181016 


1181999 


F 


CPn1029


1182008 


1182844 


F 


CPn1030


1183886 


1182843 


R 


CPn1031


1185552 


1184098 


R 


CPn1032


1186150 


1185566 


R 


CPn1033


1187500 


1186187 


R 


CPn1034


1188517 


1187732 


R 


CPn1035


1190000 


1188570 


R 


CPn1036


1191135 


1189984 


R 


CPn1037


1192199 


1191123 


R 


CPn1038


1192726 


1192199 


R 


CPn1039


1193999 


1192665 


R 


CPn1040


1194741 


1194073 


R 


CPn1041


1195994 


1194726 


R 


CPn1042


1196590


1195934 


R 


CPn1043


1197717 


1196572


R 


CPn1044


1198691 


1197699 


R 


CPn1045


1199590 



1198901 


R 


CPn1046


1200625


1199590


R 


CPnl047 


1200552


1201343 


F 


CPn1048


1201606


1202604




CPnl049 


1202595 


1203914 


F 


CPn1050


1203926 


1204798 


F 


CPn1051


1204962 


1205270 


F 


CPn1052


1205417 


1206169 


F 


CPn1053


1206153 


1206701 


F 


CPn1054


1207034 


1209466 


F 


CPn1055


1209694 


1210521 


F 


CPn1056


1210527 


1211228 


F 


CPn1057


1211497 


1213596 


F 


CPn1058


1213748 


1214836 


F 


CPn1059


1214848 


1215678 


F 


CPn1060


1217658 


1215727 


R 


CPn1061


1217920 


1217666 


R 


CPn1062


1219820 


1218159 


R 


CPn1063


1219951 


1220712 


F 



rl20-L20 Ribosomal Protein- (CT835)

pheS-Phenylalanyl tRNA Synthetase, Alpha- (CT836)

CT837 hypothetical protein

CT838 hypothetical protein

CT839 hypothetical protein

mesJ-PP-loop superfamily ATPase- (CT840)

ftsH-ATP-dependent zinc protease- (CT841)

pnp-Polyribonucleotide Nucleotidyltransferase- (CT842)

rs15-S15 Ribosomal Protein- (CT843)

yfhC-cytosine deaminase- (CT844)

CT845 hypothetical protein

CT846 hypothetical protein

CT847 hypothetical protein

CT848 hypothetical protein

CT849 hypothetical protein

CT849.1 hypothetical protein

CT850 hypothetical protein

map-Methionine Aminopeptidase- (CT851)

CT852 hypothetical protein

CT853 hypothetical protein

yzeB-ABC transporter permease- (CT854)

fumC-Fumarate Hydratase- (CT855)

ychM-Sulfate Transporter- (CT856)

CT857 hypothetical protein (possible IM protein)

CT858 hypothetical protein

lytB-Metalloprotease- (CT859)

CT860 hypothetical protein
CT861 hypothetical protein
lcrH_2-Low Calcium Response_2- (CT862)
CT863 hypothetical protein

xerD-Integrase/recombinase- (CT864)
pgi-Glucose-6-P Isomerase- (CT378)
ltuA-(CT377)

mdhC-Malate Dehydrogenase- (CT376)

predicted D-amino acid dehydrogenase- (CT375)
arcD-Arginine/Ornithine Antiporter- (CT374)
CT373 hypothetical protein
CT372 hypothetical protein

Predicted OMP_1 (CT371) [leader (18) peptide]
aroE-Shikimate 5-Dehydrogenase- (CT370)
aroB-Dehydroquinate Synthase- (CT369)
aroC-Chorismate Synthase- (CT368)
aroL-Shikimate Kinase II-(CT367)
aroA-Phosphoshikimate Vinyltransferase- (CT366)

*bioA-Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase

*bioD-dethiobiotin synthetase

bioF_2-Oxononanoate Synthase_2

*bioB-Biotin Synthase
conserved hypothetical bacterial membrane protein
(Tryptophan Hydroxylase)

dapB-Dihydrodipicolinate Reductase- (CT364)



lysC-Aspartokinase III- (CT362)
dapA-Dihydrodipicolinate Synthase- (CT361)



CT356 hypothetical protein
CT355 hypothetical protein
kgsA-Dimethyladenosine Transferase- (CT354)
dxs/tkt-Transketolase- (CT331)
CT330 hypothetical protein
xseA-Exodeoxyribonuclease VII- (CT329)
tpiS-Triosephosphate Isomerase- (CT328)
CPn1064


1220719 


1220895 


F 


CPn1065


1221095 


1220928 


R 


CPn1066


1221135 


1221488 


F 


CPn1067


1221735 


1222292 


F 


CPn1068


1223258 


1222365 


R 


CPn1069


1223513 


1223941


F 


CPn1070


1225511 


1224144 


R 


CPn1071


1227324 


1225885 


R 


CPn1072


1227969 


1228835 


F 


CPn1073


1229011 


1229832 


F 



def-Polypeptide Deformylase- (CT353)
rnhB_2-Ribonuclease HII_2- (CT008)
yfgA-HTH Transcriptional Regulator- (CT009)



Predicted OMP_2- (CT371)



set. 
Table 2 (Supplemental Data) Functional Assignments of C. pneumoniae Coding Sequences. C. trachomatis genes are shown in
parentheses.



Amino Acid Biosynthesis 

Aromatic Family 



1039 


(CT366) 


aroA 


Phosphoshikimate Vinyltransferase 


1036 


(CT369) 


aroB 


Dehydroquinate Synthase


1037 


(CT368) 


aroC 


Chorismate Synthase 


1035 


(CT370) 


aroE 


Shikimate 5-Dehydrogenase


0484 


(CT382) 


aroG 


Deoxyheptonate Aldolase 


1038 


(CT367) 


aroL 


Shikimate Kinase II 


0740 


(CT637) 


tyrB


Aromatic AA Aminotransferase 


Aspartate Family (lysine) 




1048 


(CT363) 


asd 


Aspartate Dehydrogenase


1050 


(CT361) 


dapA 


Dihydrodipicolinate Synthase


1047 


(CT364) 


dapB 


Dihydrodipicolinate Reductase 


0519 


(CT430) 


dapF 


Diaminopimelate Epimerase


1049 


(CT362) 


lysC 


Aspartokinase III



Serine Family 

0433 (CT282) gcsH Glycine Cleavage System H Protein
0521 (CT432) glyA Serine Hydroxymethyltransferase



Base & Nucleotide Metabolism



0171 




guaA 


GMP Synthase 


0172 




guaB 


Inosine 5'-Monophosphate Dehydrogenase


0608 






Uridine 5'-Monophosphate Synthase 


0735 






Uridine Kinase 


0244 


(CT128) 


adk 


Adenylate Kinase 


0894 


(CT751) 


amn 


AMP Nucleosidase 


0568 


(CT452) 


cmk 


CMP Kinase 


0392 


(CT039) 


dcd 


dCTP Deaminase 


0059 


(CT292) 


dut 


dUTP Nucleotidohydrolase 


0120 


(CT030) 


gmk 


GMP Kinase 


0619 


(CT500) 


ndk 


Nucleoside-2-P Kinase 


0984 


(CT827) 


nrdA 


Ribonucleoside Reductase, Large Chain 


0985 


(CT828) 


nrdB 


Ribonucleoside Reductase, Small Chain


0236 


(CT183) 


pyrG


CTP Synthetase 


0698 


(CT678) 


pyrH 


UMP Kinase 


0273 


(CT188) 


tdk 


Thymidylate Kinase 


0659 


(CT539) 


trxA


Thioredoxin


0314 


(CT099) 


trxB 


Thioredoxin Reductase 


1001 


(CT844) 


yfhC 


Cytosine Deaminase 



Biosynthesis of Cofactors 

Biotin, Lipoate & Ubiquinone



1041 




bioA 


Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase 


1044 




bioB 


Biotin Synthase 


1042 




bioD 


Dethiobiotin Synthetase


0923 


(CT777) 


bioF_1


Oxononanoate Synthase_1


1043 


(CT777) 


bioF_2 


Oxononanoate Synthase_2 


0866 


(CT725) 


birA 


Biotin Synthetase 


0748 


(CT628) 


ispA 


Geranyl Transtransferase 


0832 


(CT558) 


lipA 


Lipoate Synthetase 
0265


(CT219)


ubiA 


Benzoate Octaprenyltransferase


0264 


(CT220) 


ubiD 


Phenylacrylate Decarboxylase 


0515 


(CT428) 


ubiE 


Ubiquinone Methyltransferase 


Folic Acid 






0759 


(CT612)


folA 


Dihydrofolate Reductase 


0335 


(CT078) 


folD 


Methylene Tetrahydrofolate Dehydrogenase


0758 


(CT613) 


folP 


Dihydropteroate Synthase 


0757 


(CT614) 


folX


Dihydroneopterin Aldolase 


0763 


(CT649) 


ygfA 


Formyltetrahydrofolate Cycloligase 


Porphyrin 








0714 


(CT662) 


hemA 


Glutamyl tRNA Reductase 


0744 


(CT633) 


hemB 


Porphobilinogen Synthase 


0052 


(CT299) 


hemC 


Porphobilinogen Deaminase 


0890 


(CT747) 


hemE 


Uroporphyrinogen Decarboxylase 


0888


(CT745) 


hemG 


Protoporphyrinogen Oxidase


0138 


(CT210) 


hemL


Glutamate-1-Semialdehyde-2,1-Aminomutase


0380 


(CT052) 


hemN_1 


Coproporphyrinogen III Oxidase_1


0889 


(CT746) 


hemN_2


Coproporphyrinogen III Oxidase_2 


0603 


(CT485) 


hemZ


Ferrochelatase


Riboflavin 






0872 


(CT731) 


ribA&ribB 


GTP Cyclohydratase & DHBP Synthase


0532 


(CT405) 


ribC 


Riboflavin Synthase 


0871 


(CT730) 


ribD 


Riboflavin Deaminase 


0873 


(CT732) 


ribE 


Ribityllumazine Synthase 


0320 


(CT093) 


ribF 


FAD Synthase 



Cell Envelope 

Fatty Acid & Phospholipid Metabolism 



0161 


(CT206) 




(predicted acyltransferase family) 


0922 


(CT776) 


aas 


Acylglycerophosphoethanolamine Acyltransferase 


0414 


(CT265) 


accA 


AcCoA Carboxylase/Transferase Alpha 


0183 


(CT123) 


accB 


Biotin Carboxyl Carrier Protein 


0182 


(CT124) 


accC 


Biotin Carboxylase 


0058 


(CT293) 


accD 


AcCoA Carboxylase/Transferase Beta


0295 


(CT236) 


acpP 


Acyl Carrier Protein 


0313 


(CT100) 


acpS 


Acyl-carrier Protein Synthase 


0567 


(CT451) 


cdsA 


Phosphatidate Cytidylyltransferase


0297 


(CT238) 


fabD 


Malonyl Acyl Carrier Transacylase


0916 


(CT770) 


fabF 


Acyl Carrier Protein Synthase 


0296 


(CT237) 


fabG 


Oxoacyl (Carrier Protein) Reductase 


0298 


(CT239) 


fabH 


Oxoacyl Carrier Protein Synthase III 


0406 


(CT104) 


fabI


Enoyl-Acyl-Carrier Protein Reductase 


0651 


(CT532) 


fabZ 


Myristoyl-Acyl Carrier Dehydratase 


0098 


(CT010)


htrB 


Acyltransferase 


0271 


(CT136) 




Lysophospholipase Esterase 


0615 


(CT496) 


pgsA_1


Glycerol-3-P Phosphatidyltransferase_1


0947 


(CT797) 


pgsA_2 


Glycerol-3-P Phosphatidyltransferase_2


0958 


(CT807)


plsB


Glycerol-3-P Acyltransferase


0569 


(CT453) 


plsC


Glycerol-3-P Acyltransferase 


0962 


(CT811) 


plsX


FA/Phospholipid Synthesis Protein 


0839 


(CT699) 


psdD 


Phosphatidylserine Decarboxylase 


0983 


(CT826) 


pssA 


Glycerol-Serine Phosphatidyltransferase


0921 


(CT775) 




snGlycerol-3-P Acyltransferase 


0654 


(CT535) 


yciA 


Acyl-CoA Thioesterase 


0877 


(CT736) 


ybcL 


CT736 Hypothetical Protein 



LPS 
0154


(CT208) 


gseA


KDO Transferase 






0721 


(CT655) 


kdsA 


KDO Synthetase 






0235 


(CT182) 


kdsB 


Deoxyoctulonosic Acid Synthetase






0650 


(CT531) 


lpxA


Acyl-Carrier UDP-GlcNAc O-Acyltransferase 






0965 


(CT411) 


lpxB


Lipid A Disaccharide Synthase 






0652 


(CT533) 


lpxC


Myristoyl GlcNac Deacetylase 






0302 


(CT243) 


lpxD


UDP Glucosamine N-Acyltransferase 






Membrane Proteins. Lipoproteins & Porins 






0310 


(CT251) 


60IM 


60kDa Inner Membrane Protein 






0556 


(CT442) 


crpA 


15kDa Cysteine-Rich Protein 






0653 


(CT534) 


cutE


Apolipoprotein N-Acetyltransferase 






0311


(CT252) 


lgt


Prolipoprotein Diacylglycerol Transferase






0558 


(CT444) 


omcA 


9kDa-Cysteine-Rich Lipoprotein 






0557 


(CT443) 


omcB 


60kDa Cysteine-Rich OMP 






0695 


(CT681)


ompA


Major Outer Membrane Protein 






0854 


(CT713) 


ompB 


Outer Membrane Protein B 






0781 


(CT600) 


pal 


Peptidoglycan-Associated Lipoprotein 






0300 


(CT241) 


yaeT


Omp85 Homolog






Peptidoglycan








0417 


(CT268) 


amiA 


N-Acetylmuramoyl Alanine Amidase






0780 


(CT601) 


amiB 


N-Acetylmuramoyl-L-Ala Amidase 






0672 


(CT551)


dacF 


D-Ala-D-Ala Carboxypeptidase






0968 


(CT816) 


glmS 


Glucosamine-Fructose-6-P Aminotransferase 






0749 


(CT629) 


glmU 


UDP-GlcNAc Pyrophosphorylase 






0900 


(CT757) 


mraY 


Muramoyl-Pentapeptide Transferase






0571 


(CT455) 


murA 


UDP-N-Acetylglucosamine Transferase 






0988 


(CT831) 


murB 


UDP-N-Acetylenolpyruvoylglucosamine Reductase






0905 


(CT762) 


murC&ddlA Muramate-Ala Ligase & D-Ala-D-Ala Ligase






0901 


(CT758) 


murD 


Muramoylalanine-Glutamate Ligase 






0418 


(CT269) 


murE 


N-Acetylmuramoylalanylglutamyl DAP Ligase 






0899 


(CT756) 


murF 


Muramoyl-DAP Ligase






0904 


(CT761) 


murG 


Peptidoglycan Transferase






0902 


(CT759) 


nlpD 


Muramidase (invasin repeat family) 






0694 


(CT682) 


pbp2 


PBP2-Transglycolase/Transpeptidase 






0419 


(CT270) 


pbp3 


Transglycolase/Transpeptidase 






0421 


(CT272) 


yabC 


PBP2B Family Methyltransferase 












Cellular Processes 






Cell Division 










0959 


(CT808) 


cafE 


Axial Filament Protein 






0880 


(CT739) 


ftsK 


Cell Division Protein FtsK 






0903 


(CT760) 


ftsW 


Cell Division Protein FtsW 






0972 


(CT820) 


ftsY 


Cell Division Protein FtsY 






0617 


(CT498) 


gidA 


FAD-dependent Oxidoreductase






0805 


(CT582) 


minD 


Chromosome Partitioning ATPase 






0850 


(CT709)


mreB 


Rod Shape Protein-Sugar Kinase 






0867 


(CT726) 


rodA 


Rod Shape Protein 












0684


(CT688)


parB


Chromosome Partitioning Protein








Detoxification










0057 


(CT294) 


sodM 


Superoxide Dismutase (Mn) 






0778 


(CT603) 


ahpC 


Thiol-specific Antioxidant (TSA) Peroxidase






Signal Transduction 










0148 


(CT145)




S/T Protein Kinase 






0584 


(CT467) 


atoS 


Two-Component Sensor 






0294 


(CT235) 




cAMP-Dependent Protein Kinase Regulatory Subunit






0712 


(CT664) 




(FHA domain) 










0478 


(CT379) 


hflX 


GTP Binding Protein 






0703 


(CT673) 




S/T Protein Kinase 






0095 


(CT301) 




S/T Protein Kinase 






0397 


(CT259) 




PP2C Phosphatase Family 






0037 


(CT337) 


ptsH 


PTS Phosphocarrier Protein Hpr 






0038 


(CT336) 


ptsI


PTS PEP Phosphotransferase 






0060 


(CT291) 


ptsN_1


PTS IIA Protein_1






0061 


(CT290) 


ptsN_2 


PTS IIA Protein + HTH DNA-Binding Domain 






0262 


(CT218) 


surE 


SurE-like Acid Phosphatase 






0838 


(CT698) 


thdF 


Thiophene/Furan Oxidation Protein 






0693 


(CT683) 




TPR Repeats-CT683 Hypothetical Protein






0321 


(CT092) 


ychF 


GTP Binding Protein 






0544 


(CT418) 


yhbZ 


GTP binding protein 






0844 


(CT703) 


yphC 


GTPase/GTP-binding protein 






Standard Protein Secretion 








0115 


(CT025) 


ffh 


Signal Recognition Particle GTPase 






0363 


(CT060) 


flhA 


Flagellar Secretion Protein 






0858 


(CT717) 


fliI


Flagellum-specific ATP Synthase






0704 


(CT672) 


fliN 


Flagellar Motor Switch Domain/YscQ family 






0815 


(CT572) 


gspD 


Gen. Secretion Protein D 






0816 


(CT571) 


gspE 


Gen. Secretion Protein E 






0817 


(CT570) 


gspF 


Gen. Secretion Protein F 






0359 


(CT064) 


lepA 


GTPase 






0110 


(CT020) 


lepB 


Signal Peptidase I 






0535 


(CT408) 


lspA


Lipoprotein Signal Peptidase 






0260 


(CT141)


secA_1


Protein Translocase Subunit_1






0841 


(CT701) 


secA_2


Translocase SecA_2 






0564 


(CT448) 


secD&secF Protein Export Proteins SecD/SecF (fusion)






0075 


(CT321) 


secE 


Preprotein Translocase 




0629 


(CT510) 


secY 


Translocase 






0848 


(CT707) 


tig 


Trigger Factor-Peptidyl-prolyl Isomerase






Transport-Related Proteins 








0486 






Hypothetical Proline Permease 






0289 


(CT230) 


aaaT 


Neutral Amino Acid (Glutamate) Transporter 






0691 


(CT685) 


abcX 


ABC Transporter ATPase 






1031 


(CT374) 


arcD 


Arginine/Ornithine Antiporter






0482 


(CT381) 


artJ 


Arginine Periplasmic Binding Protein






0836 


(CT554) 


brnQ


Amino Acid (Branched) Transport 






0536 


(CT409) 


dagA_1


D-Ala/Gly Permease_1






0876 


(CT735) 


dagA_2 


D-Alanine/Glycine Permease_2






0682 


(CT690) 


dppD 


ABC ATPase Dipeptide Transport






0683 


(CT689) 


dppF 


ABC ATPase Dipeptide Transport






0280 


(CT689) 


dppF 


Dipeptide Transporter ATPase






0785 


(CT596) 


exbB 


Macromolecule Transporter 






0784 


(CT597) 


exbD 


Biopolymer Transport Protein 






0604 


(CT486) 


fliY 


Glutamine Binding Protein 






0192 


(CT129) 


glnP 


ABC Amino Acid Transporter Permease 






0191


(CT130)


glnQ


ABC Amino Acid Transporter ATPase 






0528 


(CT401) 


gltT


Glutamate Symport 






0286 


(CT194) 


mgtE 


Mg2+ Transporter (CBS Domain)






0413 


(CT264) 


msbA 


Transport ATP Binding Protein 






0290 


(CT231) 




Na+-dependent Transporter






0195 


(CT198) 


oppA_1


Oligopeptide Binding Protein_1






0196 


(CT198) 


oppA_2 


Oligopeptide Binding Protein_2 






0197 


(CT139) 


oppA_3 


Oligopeptide Binding Protein_3






0198 


(CT175) 


oppA_4 


Oligopeptide Binding Protein_4










0599 


(CT480) 


oppA_5 


Oligopeptide Binding Lipoprotein_5






0199 


(CT199) 


oppB_1


Oligopeptide Permease_1






0598 


(CT479) 


oppB_2 


Oligopeptide Permease_2 






0200 


(CT200) 


oppC_1


Oligopeptide Permease_1






0597 


(CT478) 


oppC_2 


Oligopeptide Permease_2 






0201 


(CT201) 


oppD 


Oligopeptide Transport ATPase 






0202 


(CT202) 


oppF 


Oligopeptide Transport ATPase 






0231


(CT180) 


tauB 


ABC Transport ATPase (Nitrate/Fe) 






0782 


(CT599) 


tolB 


Macromolecule Transporter 






0969 


(CT817) 


tyrP_1


Tyrosine Transport_1






0970 


(CT818)


tyrP_2 


Tyrosine Transport_2 






0665 


(CT544) 


uhpC 


Hexosephosphate Transport






0282 


(CT216) 


xasA 


Amino Acid Transporter






0207 


(CT204) 


ybhI


dicarboxylate Translocator 






0971 


(CT819)


yccA 


Transport Permease 






0248 


(CT152) 


ycfV 


ABC Transporter ATPase 






1014 


(CT856) 


ychM 


Sulfate Transporter






0736 


(CT641)


ygeD 


Efflux Protein






0680 


(CT692) 


ygo4 


Phosphate Permease






0723 


(CT653) 


yhbG 


ABC Transporter ATPase 






0023 


(CT348) 


yiiK 


ABC Transporter Protein ATPase 






0127 


(CT034) 


ytfF 


Cationic Amino Acid Transporter






0349 


(CT067) 


ytgA 


Solute Protein Binding Family






0348 


(CT068) 


ytgB 


ABC transporter ATPase 




0347 


(CT069) 


ytgC 


Integral Membrane Protein 






0346 


(CT070) 


ytgD 


Integral Membrane Protein 






1012 


(CT854) 


yzeB 


ABC Transporter Permease 






0868 


(CT727) 


zntA 


Metal Transport P-type ATPase 






0279 






Possible ABC Transporter Permease Protein 




0543 


(CT417) 




(Metal Transport Protein) 






0692 


(CT684) 




ABC Transporter 






0542 


(CT416) 




ABC Transporter ATPase 






0690 


(CT686) 




ABC Transporter Membrane Protein 






0541 


(CT415) 




solute binding protein 






Type-III Secretion










0323 


(CT090)


lcrD


Low Calcium Response D






0324 


(CT089) 


lcrE 


Low Calcium Response E 






0811 


(CT576) 


lcrH_1


Low Ca Response Protein H_1






1021 


(CT862) 


lcrH_2 


Low Calcium Response_2 






0325 


(CT088) 


sycE 


Secretion Chaperone 






0702 


(CT674) 


yscC 


Yop C/Gen Secretion Protein D 






0828 


(CT559) 


yscJ 


Yop Translocation J 






0826 


(CT561) 


yscL 


Yop Translocation L 






0707 


(CT669) 


yscN 


Yop N (Flagellar-Type ATPase)






0825 


(CT562) 


yscR 


Yop Translocation R 






0824 


(CT563) 


yscS 


YopS Translocation Protein 






0823 


(CT564) 


yscT 


YopT Translocation T






0322 


(CT091)


yscU 


Yop Translocation Protein U 












Central Intermediary Metabolism






Glycogen Metabolism 










0856 


(CT715) 




UDP-Glucose Pyrophosphorylase 






0948 


(CT798) 


glgA


Glycogen Synthase 






0475 


(CT866) 




Glucan Branching Enzyme 






0607 


(CT489) 


glgC


Glucose-1-P Adenyltransferase






0307 


(CT248) 




Glycogen Phosphorylase 






0388 


(CT042) 




Glycogen Hydrolase (debranching) 






0326 (CT087) malQ 
0851 (CT710) pckA 



Phosphorous & Sulfur



0548 


(CT435) 




0920 


(CT774) 


cysQ 


0025 


(CT346) 


atsA 


0918 


(CT772) 


ppa


DNA Mismatch Repair 




0505 






0812 


(CT575) 


mutL


0941 


(CT792) 


mutS 


0402 


(CT107) 


mutY 


0732 


(CT625) 


nfo 


0837 


(CT697) 


nth 


DNA Modification 




0596 


(CT477) 


ada


0114 


(CT024) 


hemK 


0891 


(CT748) 


mfd 


0620 


(CT501) 


ruvA 


0390 


(CT040) 


ruvB


0621 


(CT502) 


ruvC 


0053 


(CT298) 


sms 


0773 


(CT607)


ung 


1062 


(CT329) 


xseA 


DNA Recombination 




0762 


(CT650) 


recA


0738 


(CT639) 


recB 


0737 


(CT640) 


recC 


0123 


(CT033) 


recD_1


0752 


(CT652) 


recD_2


0339 


(CT074) 


recF 


0340 


(CT074) 




0563 


(CT447) 


recJ 


0299 


(CT240) 


recR 


DNA Replication 




0309 


(CT250) 


dnaA_1


0424 


(CT275) 


dnaA_2


0616 


(CT497) 


dnaB 


0666 


(CT545) 


dnaE 


0942 


(CT794) 


dnaG 


0338 


(CT075) 


dnaN 


0410 


(CT261) 


dnaQ_1


0655 


(CT536) 


dnaQ_2


0040 


(CT334) 


dnaX_1


0272 


(CT187) 


dnaX_2


0149


(CT146)


dnlJ


0274 


(CT189) 


gyrA_1


0716 


(CT660) 


gyrA_2 


0275 


(CT190) 


gyrB_1


0715 


(CT661)


gyrB_2


0416 


(CT267) 


himD


0612 


(CT493) 


polA 


0924 


(CT778) 


priA 


0386 


(CT044) 


ssb 



Glucanotransferase
Phosphoenolpyruvate Carboxylase

Sulfite Reductase

Sulfite Synthesis/Biphosphate Phosphatase

Sulphohydrolase

Inorganic Pyrophosphatase

DNA Replication, Modification, Repair & Recombination

3-Methyladenine DNA Glycosylase
DNA Mismatch Repair
DNA Mismatch Repair
Adenine Glycosylase
Endonuclease IV
Endonuclease III

Methyltransferase
A/G-specific Methylase
Transcription-Repair Coupling
Holliday Junction Helicase
Holliday Junction Helicase
Crossover Junction Endonuclease
Sms Protein

Uracil DNA Glycosylase
Exodeoxyribonuclease VII

RecA Recombination Protein
Exodeoxyribonuclease V, Beta
Exodeoxyribonuclease V, Gamma
Exodeoxyribonuclease V (Alpha Subunit)_1
Exodeoxyribonuclease V, Alpha_2
ABC Superfamily ATPase
(frame-shift with 0339)
ssDNA Exonuclease
Recombination Protein

Replication Initiation Protein_1
Replication Initiation Factor_2
Replicative DNA Helicase
DNA Pol III Alpha
DNA Primase
DNA Pol III (Beta)
DNA Pol III Epsilon Chain_1
DNA Pol III Epsilon Chain_2
DNA Pol III Gamma and Tau_1
DNA Pol III Gamma and Tau_2

DNA Ligase

DNA Gyrase Subunit A_1
DNA Gyrase Subunit A_2
DNA Gyrase Subunit B_1
DNA Gyrase Subunit B_2
Integration Host Factor Alpha
DNA Polymerase I
Primosomal Protein N'
SS DNA Binding Protein
0835


(CT555) 




SWI/SNF family helicase_1


0849 


(CT708) 




SWI/SNF family helicase_2


0769 


(CT643) 


topA 


DNA Topoisomerase I-Fused to SWI Domain 


0024 


(CT347) 


xerC


Integrase/recombinase 


1024 


(CT864)


xerD


Integrase/recombinase 


Eukaryotic-Type Chromatin Factors


0886 


(CT743) 


hctA 


Histone-Like Developmental Protein


0384 


(CT046) 


hctB 


Histone-like Protein 2 


0878 


(CT737) 




SET Domain protein 


0577 


(CT460) 




SWIB (YM74) Complex Protein 


UVR Excinuclease Repair System




0096 


(CT333) 




Excinuclease ABC Subunit A


0801 


(CT586) 




Excinuclease ABC Subunit B


0940 


(CT791) 




Excinuclease ABC, Subunit C 


0772 


(CT608) 


uvrD 


DNA Helicase








Energy Metabolism 


Aerobic 








0855 


(CT714) 


gpdA 


Glycerol-3-P Dehydrogenase 


0743 


(CT634) 


nqrA 


Ubiquinone Oxidoreductase, Alpha 


0427 


(CT278) 


nqr2 


NADH (Ubiquinone) Dehydrogenase 


0428 


(CT279) 


nqr3 


NADH (Ubiquinone) Oxidoreductase, Gamma 


0429 


(CT280) 


nqr4 


NADH (Ubiquinone) Reductase 4 


0430 


(CT281) 


nqr5 


NADH (Ubiquinone) Reductase 5 


0883 


(CT740) 


nqr6 


Phenolhydrolase/NADH (Ubiquinone) Oxidoreductase 6 


ATP Biogenesis and


metabolism 




0351 


(CT065) 


adt_1


ADP/ATP Translocase_1


0614 


(CT495) 


adt_2


ADP/ATP Translocase_2 


0088 


(CT308) 


atpA 


ATP Synthase Subunit A 


0089 


(CT307) 


atpB 


ATP Synthase Subunit B 


0090 


(CT306) 


atpD 


ATP Synthase Subunit D 


0086 


(CT310) 


atpE


ATP Synthase Subunit E 


0091 


(CT305) 


atpI


ATP Synthase Subunit I 


0092 


(CT304) 


atpK 


ATP Synthase Subunit K 


0860 


(CT719) 


fliF 


Flagellar M-Ring Protein 


Electron Transport Chain 




0102 


(CT013) 


cydA 


Cytochrome Oxidase Subunit I 


0103 


(CT014) 


cydB 


Cytochrome Oxidase Subunit II


0364 


(CT059) 




Ferredoxin 


0084 


(CT312) 




Predicted Ferredoxin 


Glycolysis & Gluconeogenesis 




0281


(CT215) 


dhnA 


Predicted 1,6-Fructose Biphosphate Aldolase 


0800 


(CT587) 


eno


Enolase 


0624 


(CT505) 


gapA 


Glyceraldehyde-3-P Dehydrogenase


0056 


(CT295) 


mrsA 


Phosphomannomutase 


0967 


(CT815) 


pgm 


Phosphoglucomutase 


0160 


(CT207) 


pfkA_1


Fructose-6-P Phosphotransferase_1


0208


(CT205)




Fructose-6-P Phosphotransferase_2


1025 


(CT378) 


pgi


Glucose-6-P Isomerase


0679 


(CT693) 


pgk


Phosphoglycerate Kinase 


0863 


(CT722) 


pgmA 


Phosphoglycerate Mutase


0097 


(CT332) 


pyk 


Pyruvate Kinase 


1063 


(CT328) 


tpiS 


Triosephosphate Isomerase 


Pentose Phosphate Pathway 




0239 


(CT186) 


devB


Glucose-6-P Dehydrogenase (DevB family)


1060 


(CT331)


dxs 


Transketolase 
0360


(CT063) 


gnd 


6-Phosphogluconate Dehydrogenase 


0185 


(CT121) 


rpe


Ribulose-P Epimerase 


0141 


(CT213) 


rpiA 


Ribose-5-P Isomerase A 


0083 


(CT313) 


tal 


Transaldolase 


0893 


(CT750) 


tktB


Transketolase 


0238 


(CT185) 


zwf 


Glucose-6-P Dehydrogenase


Pyruvate Dehydrogenase 




0833 


(CT557) 


lpdA


Lipoamide Dehydrogenase 


0436 


(CT285) 


lplA_1


Lipoate Protein Ligase-Like Protein 


0618 


(CT499) 


lplA_2


Lipoate-Protein Ligase A


0033 


(CT340) 


pdhA&B 


Oxoisovalerate Dehydrogenase α/β Fusion


0304 


(CT245) 


pdhA 


Pyruvate Dehydrogenase Alpha 


0305 


(CT246) 


pdhB 


Pyruvate Dehydrogenase Beta 


0306 


(CT247) 


pdhC 


Dihydrolipoamide Acetyltransferase


TCA Cycle 






0495 


(CT390) 


aspC 


Aspartate Aminotransferase 


1013 


(CT855) 


fumC 


Fumarate Hydratase


1028 


(CT376) 


mdhC 


Malate Dehydrogenase


0789 


(CT592) 


sdhA 


Succinate Dehydrogenase 


0790 


(CT591) 


sdhB 


Succinate Dehydrogenase 


0788 


(CT593) 


sdhC 


Succinate Dehydrogenase 


0378 


(CT054) 


sucA 


Oxoglutarate Dehydrogenase 


0377 


(CT055) 


sucB_1


Dihydrolipoamide Succinyltransferase_1


0527 


(CT400) 


sucB_2 


Dihydrolipoamide Succinyltransferase_2 


0973 


(CT821) 


sucC 


Succinyl-CoA Synthetase, Beta 


0974 


(CT822) 


sucD 


Succinyl-CoA Synthetase, Alpha 



Protein Folding, Assembly & Modification 



Chaperones 






0949 


(CT799) 


ctc


General Stress Protein 


0534 


(CT407) 


dksA 


DnaK Suppressor 


0032 


(CT341) 


dnaJ 


Heat Shock Protein J 


0503 


(CT396) 


dnaK 


Hsp-70 


0134 


(CT110)


groEL_1


Hsp-60_1


0777 


(CT604) 


groEL_2 


Hsp-60_2 


0898 


(CT755) 


groEL_3 


Hsp-60_3 


0135 


(CT111)


groES 


10KDa Chaperonin


0502 


(CT395) 


grpE 


HSP-70 Cofactor 


0661 


(CT541)


mip 


FKBP-type Peptidyl-prolyl Cis-Trans Isomerase 


Proteases 








0144 


(CT113) 


clpB 


Clp Protease ATPase


0437 


(CT286) 


clpC 


ClpC Protease 


0520 


(CT431)


clpP_1


CLP Protease 


0847 


(CT706) 


clpP_2


CLP Protease Subunit 


0846 


(CT705)


clpX 


CLP Protease ATPase 


0269 


(CT138) 




Dipeptidase


0998 


(CT841) 


ftsH 


ATP-dependent Zinc Protease 


0030 


(CT343)


gcp_1


O-Sialoglycoprotein Endopeptidase_1


0194 


(CT197) 


gcp_2 


O-Sialoglycoprotein Endopeptidase_2 


0979 


(CT823) 


htrA 


DO Serine Protease 


0957 


(CT806) 


ide 


Insulinase family/Protease III


0027 


(CT344) 


lon


Lon ATP-dependent Protease 


1017 


(CT859) 


lytB 


Metalloprotease


1009 


(CT851) 


map 


Methionine Aminopeptidase 


0385 


(CT045) 


pepA 


Leucyl Aminopeptidase A 


0136 


(CT112) 


pepF 


Oligopeptidase
0813


(CT574) 


pepP 


Aminopeptidase P


0613 


(CT494) 


sohB


Protease 


0555 


(CT441) 


tsp 


Tail-Specific Protease 


0344 


(CT072) 


yaeL 


Metalloprotease


0981 


(CT824) 




Zinc Metalloprotease (insulinase family) 


Protein Isomerases

0227 (CT176) dsbB Disulfide bond Oxidoreductase
0786 (CT595) dsbD Thio:disulfide Interchange Protein
0228 (CT177) dsbG Disulfide Bond Chaperone
0933 (CT783) Predicted Disulfide Bond Isomerase
0926 (CT780) Thioredoxin Disulfide Isomerase



Transcription 

RNA Degradation 







0999 (CT842) pnp Polyribonucleotide Nucleotidyltransferase
0054 (CT297) rnc Ribonuclease III
0119 (CT029) rnhB_1 Ribonuclease HII_1
1068 (CT008) rnhB_2 Ribonuclease HII_2
0934 (CT784) rnpA Ribonuclease P Protein Component
0504 (CT397) vacB Ribonuclease Family


RNA Elongation & Termination Factors

0741 (CT636) greA Transcription Elongation Factor
0316 (CT097) nusA N Utilization Protein A
0076 (CT320) nusG Transcriptional Antitermination
0845 (CT704) pcnB_1 Poly A Polymerase_1
0966 (CT410) pcnB_2 Poly A Polymerase_2
0610 (CT491) rho Transcription Termination Factor






RNA Methylases 










0674 (CT553) fmu RNA Methyltransferase
1059 (CT354) ksgA Dimethyladenosine Transferase
0187 (CT133) Predicted Methylase
0530 (CT403) spoU_1 rRNA Methylase_1
0660 (CT540) spoU_2 rRNA Methylase_2
0117 (CT027) trmD tRNA (Guanine N-1)-Methyltransferase
0885 (CT742) ygcA rRNA Methyltransferase
0986 (CT829) yggH Predicted rRNA Methylase
0987 (CT830) ytgB Predicted rRNA Methylase




RNA Modification 










0649 (CT530) fmt Methionyl tRNA Formyltransferase
0910 (CT766) miaA tRNA Pyrophosphate Transferase
0719 (CT658) sfhB Predicted Pseudouridine Synthase
0219 (CT193) tgt Queuine tRNA Ribosyl Transferase
0580 (CT463) truA Pseudouridylate Synthase I
0319 (CT094) truB tRNA Pseudouridine Synthase
0403 (CT106) yceC Predicted Pseudouridine Synthetase Family
0864 (CT723) yjbC Predicted Pseudouridine Synthase



RNA Polymerase & Transcription Regulators 



0586 (CT468) atoC Two-Component Regulator
0362 (CT061) rpsD Sigma-28/WhiG Family
0501 (CT394) hrcA HTH Transcriptional Repressor
0793 (CT588) rsbU Sigma Regulatory Family Protein-PP2C Phosphatase (RsbW Antagonist)
0626 (CT507) rpoA RNA Polymerase Alpha
0081 (CT315) rpoB RNA Polymerase Beta
0082 (CT314) rpoC RNA Polymerase Beta'
0756 (CT615) rpoD RNA Polymerase Sigma-66
0771 (CT609) rpoN RNA Polymerase Sigma-54
0511 (CT424) rsbV_1 Sigma Regulatory Factor_1
0909 (CT765) rsbV_2 Sigma Factor Regulator_2
0670 (CT549) rsbW Sigma Regulatory Factor-Histidine Kinase
0750 (CT630) tctD HTH Transcriptional Regulatory Protein + Receiver Domain
1069 (CT009) yfgA HTH Transcriptional Regulator



Translation

Amino Acyl tRNA Synthesis

0892 (CT749) alaS Alanyl tRNA Synthetase
0570 (CT454) argS Arginyl tRNA Transferase
0662 (CT542) aspS Aspartyl tRNA Synthetase
0932 (CT782) cysS Cysteinyl tRNA Synthetase
0003 (CT003) gatA Glu tRNA Gln Amidotransferase (A subunit)
0004 (CT004) gatB Glu tRNA Gln Amidotransferase (B Subunit)
0002 (CT002) gatC Glu tRNA Gln Amidotransferase (C subunit)
0560 (CT445) gltX Glutamyl-tRNA Synthetase
0946 (CT796) glyQ Glycyl tRNA Synthetase
0663 (CT543) hisS Histidyl tRNA Synthetase
0109 (CT019) ileS Isoleucyl-tRNA Synthetase
0153 (CT209) leuS Leucyl tRNA Synthetase
0931 (CT781) lysS Lysyl tRNA Synthetase
0122 (CT032) metG Methionyl-tRNA Synthetase
0993 (CT836) pheS Phenylalanyl tRNA Synthetase, Alpha
0594 (CT475) pheT Phenylalanyl tRNA Synthetase, Beta
0500 (CT393) proS Prolyl tRNA Synthetase
0870 (CT729) serS Seryl tRNA Synthetase_2
0806 (CT581) thrS Threonyl tRNA Synthetase
0802 (CT585) trpS Tryptophanyl tRNA Synthetase
0361 (CT062) tyrS Tyrosyl tRNA Synthetase
0094 (CT302) valS Valyl tRNA Synthetase



Peptide Chain Initiation, Elongation & Termination

1067 (CT353) def Polypeptide Deformylase
0184 (CT122) efp_1 Elongation Factor P_1
0895 (CT752) efp_2 Elongation Factor P_2
0550 (CT437) fusA Elongation Factor G
0073 (CT323) infA Initiation Factor IF-1
0317 (CT096) infB Initiation Factor-2
0990 (CT833) infC Initiation Factor 3
0113 (CT023) prfA Peptide Chain Releasing Factor 1
0576 (CT459) prfB Peptide Chain Release Factor 2
0950 (CT800) pth Peptidyl tRNA Hydrolase
0318 (CT095) rbfA Ribosome Binding Factor A
0699 (CT677) rrf Ribosome Releasing Factor
0697 (CT679) tsf Elongation Factor TS
0074 (CT322) tufA Elongation Factor Tu


Ribosomal Proteins

0078 (CT318) rl1 L1 Ribosomal Protein
0644 (CT525) rl2 L2 Ribosomal Protein
0647 (CT528) rl3 L3 Ribosomal Protein
0646 (CT527) rl4 L4 Ribosomal Protein
0635 (CT516) rl5 L5 Ribosomal Protein
0633 (CT514) rl6 L6 Ribosomal Protein
0080 (CT316) rl7 L7/L12 Ribosomal Protein
0953 (CT803) rl9 L9 Ribosomal Protein
0079 (CT317) rl10 L10 Ribosomal Protein
0077 (CT319) rl11 L11 Ribosomal Protein
0247 (CT125) rl13 L13 Ribosomal Protein
0637 (CT518) rl14 L14 Ribosomal Protein
0630 (CT511) rl15 L15 Ribosomal Protein
0640 (CT521) rl16 L16 Ribosomal Protein
0625 (CT506) rl17 L17 Ribosomal Protein
0632 (CT513) rl18 L18 Ribosomal Protein
0118 (CT028) rl19 L19 Ribosomal Protein
0992 (CT835) rl20 L20 Ribosomal Protein
0546 (CT420) rl21 L21 Ribosomal Protein
0642 (CT523) rl22 L22 Ribosomal Protein
0645 (CT526) rl23 L23 Ribosomal Protein
0636 (CT517) rl24 L24 Ribosomal Protein
0545 (CT419) rl27 L27 Ribosomal Protein
0327 (CT086) rl28 L28 Ribosomal Protein
0639 (CT520) rl29 L29 Ribosomal Protein
0112 (CT022) rl31 L31 Ribosomal Protein
0961 (CT810) rl32 L32 Ribosomal Protein
0250 (CT150) rl33 L33 Ribosomal Protein
0935 (CT785) rl34 L34 Ribosomal Protein
0991 (CT834) rl35 L35 Ribosomal Protein
0936 (CT786) rl36 L36 Ribosomal Protein
0315 (CT098) rs1 S1 Ribosomal Protein
0696 (CT680) rs2 S2 Ribosomal Protein
0641 (CT522) rs3 S3 Ribosomal Protein
0733 (CT626) rs4 S4 Ribosomal Protein
0631 (CT512) rs5 S5 Ribosomal Protein
0951 (CT801) rs6 S6 Ribosomal Protein
0551 (CT438) rs7 S7 Ribosomal Protein
0634 (CT515) rs8 S8 Ribosomal Protein
0246 (CT126) rs9 S9 Ribosomal Protein
0549 (CT436) rs10 S10 Ribosomal Protein
0627 (CT508) rs11 S11 Ribosomal Protein
0552 (CT439) rs12 S12 Ribosomal Protein
0628 (CT509) rs13 S13 Ribosomal Protein
0937 (CT787) rs14 S14 Ribosomal Protein
1000 (CT843) rs15 S15 Ribosomal Protein
0116 (CT026) rs16 S16 Ribosomal Protein
0638 (CT519) rs17 S17 Ribosomal Protein
0952 (CT802) rs18 S18 Ribosomal Protein
0643 (CT524) rs19 S19 Ribosomal Protein
0754 (CT617) rs20 S20 Ribosomal Protein
0031 (CT342) rs21 S21 Ribosomal Protein



Other Categories 



Chlamydia-Specific Proteins 




0561 (CT446) Euo CHLPS Euo Protein
0804 (CT583) Gp6D CHLTR Plasmid Paralog
0186 (CT119) Similarity to IncA_1
0291 (CT232) incB Inclusion Membrane Protein B
0292 (CT233) incC Inclusion Membrane Protein C
1026 (CT377) LtuA Protein
0333 (CT080) LtuB Protein
0005 (CT871) pmp_1 Polymorphic Outer Membrane Protein G Family
0013 (CT871) pmp_2 Polymorphic Outer Membrane Protein G Family
0014 (CT871) pmp_3 Polymorphic Outer Membrane Protein G Family
0015 (CT871) pmp_3 PMP_3 (frame-shift with 0014)
0016 (CT874) pmp_4 Polymorphic Outer Membrane Protein G Family
0017 (CT871) pmp_4 PMP_4 (frame-shift with 0016)
0018 (CT874) pmp_5 Polymorphic Outer Membrane Protein G Family
0019 (CT871) pmp_5 PMP_5 (frame-shift with 0018)
0444 (CT871) pmp_6 Polymorphic Outer Membrane Protein G/I Family
0445 (CT871) pmp_7 Polymorphic Outer Membrane Protein G Family
0446 (CT871) pmp_8 Polymorphic Outer Membrane Protein G Family
0447 (CT871) pmp_9 Polymorphic Outer Membrane Protein G/I Family
0450 (CT871) pmp_10 Polymorphic Outer Membrane Protein G Family
0449 (CT871) pmp_10 PMP_10 (Frame-shift with 0450)
0451 (CT871) pmp_11 Polymorphic Outer Membrane Protein G Family
0452 (CT874) pmp_12 Polymorphic Outer Membrane Protein (truncated) A
0453 (CT871) pmp_13 Polymorphic Outer Membrane Protein G Family
0454 (CT872) pmp_14 Polymorphic Outer Membrane Protein H Family
0466 (CT869) pmp_15 Polymorphic Outer Membrane Protein E Family
0467 (CT869) pmp_16 Polymorphic Outer Membrane Protein E Family
0468 (CT869) pmp_17 Polymorphic Outer Membrane Protein E Family
0469 (CT869) pmp_17 PMP_17 (Frame-shift with 0468)
0470 (CT869) pmp_17 PMP_17 (Frame-shift with 0469)
0471 (CT870) pmp_18 Polymorphic Outer Membrane Protein E/F Family
0539 (CT412) pmp_19 Polymorphic Membrane Protein A Family
0540 (CT413) pmp_20 Polymorphic Membrane Protein B Family
0963 (CT812) pmp_21 Polymorphic Membrane Protein D Family
0562 CHLPS 43 kDa Protein Homolog_1
0927 CHLPS 43 kDa Protein Homolog_2
0928 CHLPS 43 kDa Protein Homolog_3
0929 CHLPS 43 kDa Protein Homolog_4
0728 (CT622) CHLPN 76kDa Homolog_1 (CT622)
0729 (CT623) CHLPN 76kDa Homolog_2 (CT623)
0133 (CT109) CHLPS Hypothetical Protein
0332 (CT081) CHLTR T2 Protein


Miscellaneous Enzymes/Conserved Proteins

0193 argR Possible Arginine Repressor
1046 Aromatic Amino Acid Hydroxylase
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
0128 (CT035) Biotin Protein Ligase
0513 (CT426) Fe-S Oxidoreductase_1
0911 (CT767) Fe-S Oxidoreductase_2
0373 (CT057) gcpE GcpE Protein
0407 (CT103) HAD Superfamily Hydrolase/Phosphatase
0917 (CT771) Hydrolase/Phosphatase Homolog
0488 (CT385) ycfF HIT Family Hydrolase
0701 (CT675) karG Arginine Kinase
0526 (CT399) kpsF GutQ/KpsF Family Sugar-P Isomerase
0919 (CT773) ldh Leucine Dehydrogenase
0022 (CT349) maf Maf protein
0997 (CT840) mesJ PP-loop superfamily ATPase
0151 (CT148) mhpA Monooxygenase
0730 (CT624) mviN Integral Membrane Protein
0861 (CT720) NifU-Related Protein
0479 (CT380) phnP Metal Dependent Hydrolase
0106 (CT015) phoH ATPase
0329 (CT084) Phospholipase D Superfamily
0435 (CT284) Phospholipase D Superfamily
0581 (CT464) Phosphoglycolate Phosphatase
0897 (CT754) Predicted Phosphohydrolase
0509 (CT422) Predicted Metalloenzyme
1030 (CT375) Predicted D-Amino Acid Dehydrogenase
0531 (CT404) SAM Dependent Methyltransferase
0337 (CT076) smpB Small Protein B
0394 (CT256) tlyC_1 CBS Domain Protein (Hemolysin Homolog)_1
0510 (CT423) tlyC_2 CBS Domains (Hemolysin Homolog)_2
0382 (CT048) yabC SAM-Dependent Methyltransferase
0787 (CT594) yabD PHP Superfamily (Urease/Pyrimidinase) Hydrolase
0611 (CT492) yacE Predicted Phosphatase/Kinase
0579 (CT462) yacM Sugar Nucleotide Phosphorylase
0578 (CT461) yacI Phosphohydrolase



0345 (CT071) yacM CT071 Hypothetical Protein
0566 (CT450) yaeS YaeS family Hypothetical Protein
0591 (CT472) yagE YagE family
0039 (CT335) ybaB YbaB family Hypothetical Protein
0101 (CT012) ybbP YbbP family Hypothetical Protein
0915 (CT769) ybeB iojap Superfamily Ortholog
0137 (CT108) ybgI ACR family
0529 (CT402) ycaH ATPase
0438 (CT287) ycbF PP-loop Superfamily ATPase
0734 (CT627) yceA YceA Hypothetical Protein
0954 (CT804) ychB Predicted Kinase
0261 (CT217) ydaO PP-Loop Superfamily ATPase
0245 (CT127) ydhO Polysaccharide Hydrolase-Invasin Repeat Family
0573 (CT457) yebC YebC Family Hypothetical Protein
0689 (CT687) yfhO_1 NifS-related Aminotransferase_1
0862 (CT721) yfhO_2 NifS-related Aminotransferase_2
0547 (CT434) ygbB YgbB Family Hypothetical Protein
0237 (CT184) yggF YggF Family Hypothetical Protein
0775 (CT606) yggV YggV Family Hypothetical Protein
0396 (CT258) yhfO_3 NifS-related Aminotransferase_3
0605 (CT487) yhhF Predicted Methylase
0575 (CT458) yhhY Amino Group Acetyl Transferase
0592 (CT473) yidD YidD Family
0982 (CT825) yigN YigN Family Hypothetical Protein
0657 (CT537) yjeE YjeE Hypothetical Protein
0768 (CT644) yohI YohI Predicted Oxidoreductase
0336 (CT077) yojL YojL Hypothetical Protein
0217 (CT140) ypdP YpdP Hypothetical Protein
0140 (CT212) yqdE YqdE Hypothetical Protein
0263 (CT221) yqfU YqfU Hypothetical Protein
0139 (CT211) yqgE YqgE Hypothetical Protein
0270 (CT137) ywlC SuA5 Superfamily-related Protein
0879 (CT738) yycJ Metal Dependent Hydrolase








Homologs to CHLTR Hypothetic 


0001 (CT001) CT001 Hypothetical Protein
0020 (CT351) CT351 Hypothetical Protein
0021 (CT350) CT350 Hypothetical Protein
0026 (CT345) CT345 Hypothetical Protein
0035 (CT339) CT339 Hypothetical Protein
0036 (CT338) CT338 Hypothetical Protein
0055 (CT296) CT296 Hypothetical Protein
0062 (CT289) CT289 Hypothetical Protein
0065 (CT288) CT288 Hypothetical Protein
0068 (CT360) CT360 Hypothetical Protein
0071 (CT325) CT325 Hypothetical Protein
0072 (CT324) CT324 Hypothetical Protein
0085 (CT311) CT311 Hypothetical Protein
0087 (CT309) CT309 Hypothetical Protein
0093 (CT303) CT303 Hypothetical Protein
0100 (CT011) CT011 Hypothetical Protein
0104 (CT017) CT017 Hypothetical Protein
0105 (CT016) CT016 Hypothetical Protein
0107 (CT058) CT058 Hypothetical Protein_1
0108 (CT018) CT018 Similarity
0111 (CT021) CT021 Hypothetical Protein
0121 (CT031) CT031 Hypothetical Protein
0129 (CT036) CT036 Similarity
0145 (CT114) CT114 Hypothetical Protein
0150 (CT147) CT147 Hypothetical Protein
0152 (CT149) CT149 Hypothetical Protein
0176 (CT153) CT153 Hypothetical Protein
0188 (CT132) CT132 Hypothetical Protein
0189 (CT131) CT131 Hypothetical Protein
0206 (CT203) CT203 Hypothetical Protein
0229 (CT178) CT178 Hypothetical Protein
0230 (CT179) CT179 Hypothetical Protein
0234 (CT181) CT181 Hypothetical Protein
0249 (CT151) CT151 Hypothetical Protein
0253 (CT144) CT144 Hypothetical Protein_1
0254 (CT143) CT143 Hypothetical Protein_1
0255 (CT142) CT142 Hypothetical Protein_1
0256 (CT144) CT144 Hypothetical Protein_2
0257 (CT143) CT143 Hypothetical Protein_2
0259 (CT142) CT142 Hypothetical Protein_2
0276 (CT191) CT191 Hypothetical Protein
0288 (CT195) CT195 Hypothetical Protein
0293 (CT234) CT234 Hypothetical Protein
0301 (CT242) CT368 Hypothetical Protein
0303 (CT244) CT244 Hypothetical Protein
0308 (CT249) CT249 Similarity
0312 (CT101) CT101 Hypothetical Protein
0328 (CT085) CT085 Hypothetical Protein
0330 (CT083) CT083 Hypothetical Protein
0331 (CT082) CT082 Hypothetical Protein
0334 (CT079) CT079 Similarity
0342 (CT073) CT073 Hypothetical Protein
0343 (CT073) (frame-shift with 0342?)
0350 (CT066) CT066 Hypothetical Protein
0369 (CT058) CT058 Hypothetical Protein_2
0370 (CT058) CT058 Hypothetical Protein_3
0374 (CT056) CT056 Hypothetical Protein
0379 (CT053) CT053 Hypothetical Protein
0381 (CT326) CT326 Similarity
0383 (CT047) CT047 Hypothetical Protein
0387 (CT043) CT043 Hypothetical Protein
0389 (CT041) CT041 Hypothetical Protein
0393 (CT038) CT038 Hypothetical Protein
0395 (CT257) CT257 Hypothetical Protein
0399 (CT253) CT253 Hypothetical Protein
0400 (CT254) CT254 Hypothetical Protein
0401 (CT255) CT255 Hypothetical Protein
0405 (CT105) CT105 Hypothetical Protein
0408 (CT102) CT102 Hypothetical Protein
0409 (CT260) CT260 Hypothetical Protein
0411 (CT262) CT262 Hypothetical Protein
0412 (CT263) CT263 Hypothetical Protein
0415 (CT266) CT266 Hypothetical Protein
0420 (CT271) CT271 Hypothetical Protein
0422 (CT273) CT273 Hypothetical Protein
0423 (CT274) CT274 Hypothetical Protein
0425 (CT276) CT276 Hypothetical Protein
0426 (CT277) CT277 Similarity
0434 (CT283) CT283 Hypothetical Protein



0441 (CT007) CT007 Hypothetical Protein
0442 (CT006) CT006 Hypothetical Protein
0443 (CT005) CT005 Hypothetical Protein
0474 (CT365) CT365 Hypothetical Protein
0476 (CT865) CT865 Hypothetical Protein
0480 (CT383) CT383 Hypothetical Protein
0485 (CT382) CT382.1 Hypothetical Protein
0487 (CT384) CT384 Hypothetical Protein
0489 (CT386) CT386 Hypothetical Protein
0490 (CT387) CT387 Hypothetical Protein
0491 (CT389) CT389 Hypothetical Protein
0496 (CT391) CT391 Hypothetical Protein
0497 (CT388) CT388 Hypothetical Protein
0506 (CT421) CT421 Hypothetical Protein
0507 (CT421) CT421.1 Hypothetical Protein
0508 (CT421) CT421.2 Hypothetical Protein
0512 (CT425) CT425 Hypothetical Protein
0514 (CT427) CT427 Hypothetical Protein
0518 (CT429) CT429 Hypothetical Protein
0522 (CT433) CT433 Hypothetical Protein
0525 (CT398) CT398 Hypothetical Protein
0533 (CT406) CT406 Hypothetical Protein
0537 (CT814) CT814.1 Hypothetical Protein
0538 (CT814) CT814 Hypothetical Protein
0554 (CT440) CT440 Hypothetical Protein
0559 (CT441) CT441.1 Hypothetical Protein
0565 (CT449) CT449 Hypothetical Protein
0572 (CT456) CT456 Hypothetical Protein
0582 (CT465) CT465 Hypothetical Protein
0583 (CT466) CT466 Hypothetical Protein
0588 (CT469) CT469 Hypothetical Protein
0589 (CT470) CT470 Hypothetical Protein
0590 (CT471) CT471 Hypothetical Protein
0593 (CT474) CT474 Hypothetical Protein
0595 (CT476) CT476 Hypothetical Protein
0601 (CT483) CT483 Hypothetical Protein
0602 (CT484) CT484 Hypothetical Protein
0606 (CT488) CT488 Hypothetical Protein
0609 (CT490) CT490 Hypothetical Protein
0622 (CT503) CT503 Hypothetical Protein
0623 (CT504) CT504 Hypothetical Protein
0648 (CT529) CT529 Hypothetical Protein
0658 (CT538) CT538 Hypothetical Protein
0667 (CT546) CT546 Hypothetical Protein
0668 (CT547) CT547 Hypothetical Protein
0669 (CT548) CT548 Hypothetical Protein
0671 (CT550) CT550 Hypothetical Protein
0673 (CT552) CT552 Hypothetical Protein
0675 (CT696) CT696 Hypothetical Protein
0676 (CT695) CT695 Similarity
0681 (CT691) CT691 Hypothetical Protein
0687 (CT482) CT482 Hypothetical Protein
0688 (CT481) CT481 Hypothetical Protein
0700 (CT676) CT676 Hypothetical Protein
0705 (CT671) CT671 Hypothetical Protein
0706 (CT670) CT670 Hypothetical Protein
0708 (CT668) CT668 Hypothetical Protein





0709 (CT667) CT667 Hypothetical Protein
0710 (CT666) CT666 Hypothetical Protein
0711 (CT665) CT665 Hypothetical Protein
0713 (CT663) CT663 Hypothetical Protein
0717 (CT656) CT656 Hypothetical Protein
0718 (CT657) CT657 Hypothetical Protein
0720 (CT659) CT659 Hypothetical Protein
0722 (CT654) CT654 Hypothetical Protein
0725 (CT652) CT652.1 Hypothetical Protein
0726 (CT620) CT620 Hypothetical Protein
0727 (CT619) CT619 Hypothetical Protein
0739 (CT638) CT368 Hypothetical Protein
0742 (CT635) CT635 Hypothetical Protein
0746 (CT632) CT632 Hypothetical Protein
0747 (CT631) CT631 Hypothetical Protein
0751 (CT651) CT651 Hypothetical Protein
0755 (CT616) CT616 Hypothetical Protein
0760 (CT611) CT611 Hypothetical Protein
0761 (CT610) CT610 Hypothetical Protein
0764 (CT648) CT648 Hypothetical Protein
0765 (CT647) CT647 Hypothetical Protein
0766 (CT646) CT646 Hypothetical Protein
0767 (CT645) CT645 Hypothetical Protein
0770 (CT642) CT642 Hypothetical Protein
0774 (CT606) CT606.1 Hypothetical Protein
0776 (CT605) CT605 Hypothetical Protein
0779 (CT602) CT602 Hypothetical Protein
0783 (CT598) CT598 Hypothetical Protein
0791 (CT590) CT590 Hypothetical Protein
0792 (CT589) CT589 Hypothetical Protein
0803 (CT584) CT584 Hypothetical Protein
0807 (CT580) CT580 Hypothetical Protein
0808 (CT579) CT579 Hypothetical Protein
0809 (CT578) CT578 Hypothetical Protein
0810 (CT577) CT577 Hypothetical Protein
0814 (CT573) CT573 Hypothetical Protein
0818 (CT569) CT569 Hypothetical Protein
0819 (CT568) CT568 Hypothetical Protein
0820 (CT567) CT567 Hypothetical Protein
0821 (CT566) CT566 Hypothetical Protein
0822 (CT565) CT565 Hypothetical Protein
0827 (CT560) CT560 Hypothetical Protein
0834 (CT556) CT556 Hypothetical Protein
0840 (CT700) CT700 Hypothetical Protein
0842 (CT702) CT702 Hypothetical Protein
0843 (CT702) CT702 Hypothetical Protein
0852 (CT711) CT711 Hypothetical Protein
0853 (CT712) CT712 Hypothetical Protein
0857 (CT716) CT716 Hypothetical Protein
0859 (CT718) CT718 Hypothetical Protein
0865 (CT724) CT724 Hypothetical Protein
0869 (CT728) CT728 Hypothetical Protein
0874 (CT733) CT733 Hypothetical Protein
0875 (CT734) CT734 Hypothetical Protein
0884 (CT741) CT741 Hypothetical Protein
0887 (CT744) CHLTR Possible Phosphoprotein
0896 (CT753) CT753 Hypothetical Protein
0906 (CT763) CT763 Hypothetical Protein
0908 (CT764) CT764 Hypothetical Protein
0912 (CT768) CT768 Hypothetical Protein
0925 (CT779) CT779 Hypothetical Protein
0938 (CT788) CT788 Hypothetical Protein
0939 (CT790) CT790 Hypothetical Protein
0943 (CT794) CT794.1 Hypothetical Protein
0945 (CT795) CT795 Hypothetical Protein
0956 (CT805) CT805 Hypothetical Protein
0960 (CT809) CT809 Hypothetical Protein
0989 (CT832) CT832 Hypothetical Protein
0994 (CT837) CT837 Hypothetical Protein
0995 (CT838) CT838 Hypothetical Protein
0996 (CT839) CT839 Hypothetical Protein
1002 (CT845) CT845 Hypothetical Protein
1003 (CT846) CT846 Hypothetical Protein
1004 (CT847) CT847 Hypothetical Protein
1005 (CT848) CT848 Hypothetical Protein
1006 (CT849) CT849 Hypothetical Protein
1007 (CT849) CT849.1 Hypothetical Protein
1008 (CT850) CT850 Hypothetical Protein
1010 (CT852) CT852 Hypothetical Protein
1011 (CT853) CT853 Hypothetical Protein
1015 (CT857) CT857 Hypothetical Protein
1016 (CT858) CT858 Hypothetical Protein
1019 (CT860) CT860 Hypothetical Protein
1020 (CT861) CT861 Hypothetical Protein
1022 (CT863) CT863 Hypothetical Protein
1032 (CT373) CT373 Hypothetical Protein
1033 (CT372) CT372 Hypothetical Protein
1034 (CT371) CT371 Hypothetical Protein
1057 (CT356) CT356 Hypothetical Protein
1058 (CT355) CT355 Hypothetical Protein
1061 (CT330) CT330 Hypothetical Protein
1073 (CT371) CT371 Hypothetical Protein



Coding Genes Not in C. trachomatis

0486 Hypothetical Proline Permease
0279 Possible ABC Transporter Permease Protein
0505 3-Methyladenine DNA Glycosylase
0193 argR Similarity to Arginine Repressor
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin synthetase
0585 Similarity to Cps IncA_2
0562 CHLPS 43 kDa Protein Homolog_1
0927 CHLPS 43 kDa Protein Homolog_2
0928 CHLPS 43 kDa Protein Homolog_3
0929 CHLPS 43 kDa Protein Homolog_4
1045 Conserved Hypothetical Membrane Protein
0251 Conserved Hypothetical Protein
0278 Conserved Outer Membrane Lipoprotein Protein
0907 CutA-like Periplasmic Divalent Cation Tolerance Protein
0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0980 Similar to Saccharomyces cerevisiae 52.9 KDa Protein
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
1046 Tryptophan Hydroxylase
0477 yqeV_Bs Conserved Hypothetical Protein
0048 yqfF_Bs Conserved Hypothetical IM Protein
0587 yvyD_Bs Conserved Hypothetical Protein
0143 yxjG_Bs_1 Conserved Hypothetical Protein
0448 yxjG_Bs_2 Conserved Hypothetical Protein






0006 0180 0440 0977
0007 0181 0455 0978
0008 0190 0456 1018
0009 0203 0457 1023
0010 0204 0458 1027
0011 0205 0459 1029
0012 0209 0460 1040
0028 0210 0461 1051
0029 0211 0462 1052
0034 0212 0463 1053
0041 0213 0464 1054
0042 0214 0465 1055
0043 0215 0472 1056
0044 0216 0473 1064
0045 0218 0481 1065
0046 0220 0483 1066
0047 0221 0492 1070
0049 0222 0493 1071
0050 0223 0494 1072
0051 0224 0498
0063 0225 0499
0064 0226 0516
0066 0233 0517
0067 0240 0523
0069 0241 0524
0070 0242 0553
0099 0243 0574
0124 0266 0600
0125 0267 0656
0126 0268 0664
0130 0277 0677
0131 0283 0678
0132 0284 0685
0142 0285 0686
0146 0287 0724
0147 0352 0731
0155 0353 0745
0156 0354 0753
0157 0355 0794
0158 0356 0795
0159 0357 0796
0162 0358 0797
0163 0365 0798
0164 0366 0799
0165 0367 0829
0166 0368 0830
0167 0371 0831
0168 0372 0881
0169 0375 0882
0170 0376 0913
0173 0391 0914
0174 0398 0930
0175 0404 0944
0177 0431 0964
0178 0432 0975
0179 0439 0976






Table 3 



Chlamydia pneumoniae Genome Encoded Proteins

CPn.onoi no 4 

CT001 hypothetical protein 

KRLKDE IKYTSLRRKAMLCK I IRGLSSLIVILCALNVGLIGITHNKLNI IAKLCGGVSTP 
ATQ [TY r I IGIAGV ICLLSFCPFC3KK3RHSHGD3CSSGGCHSHHSDKN 
MHVVNVEDLREDS VTS DFNREEFLRNV PES LGGLVKV PAV I K

CPn_0003 889 2370

gatA-Glu tRNA Gln Amidotransferase

KIMYRYSALELAKAVTLGELTATGVTQHFFHRIEEAEXX}VGAFISL^^ 

KRSRGE PIXIKLAGVPVGI KDNI HVTGLKTTCASRVLE^QPPFDATV^RI^ 

KLNMDEFAMGSTTLY S AFHPTHNPWDLS RVPGGSSGGSAAAVSARFC PVALG SDTGGS I R 

Q p AAFCGWGFKPSYGAVSRYG LVAFAS SLDQ IG PLANTVEDVALMMDVFSGRDPKDATS 

REFFRDSFMSKLSTE^PKVIGVPRTFLEGLRDDIRENFFSSLAIFEGEGTHLVDVELDIL 

SHAVSIYTILASAEAATNLARFKrVRYGyRSPQAHTISQLYTLSRGEGFGKEVKRilll^ 

rr^SAERQNVYYKKATAVRAKIVKAFRTAFEKCEILAMPVCSSPAFEIGEILDPVTLYL 

QDIYTVAMNU^YLPAIAVPSGFSKBGLPLGLQIIGQOGQDOQVCQVGYSFQEHAQIKQLF 

SKRYAKSWLGGQS 

CPn_0004 2334 3833

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit)
EICQKCCSRRSI MS AVYADWES V I G L EVHVELNT AS KLF S S ALNRFGDE PNTN I STVCTG 
LPGSLPVLNQSAVEKAVLFGCAVEGEISLLSRFDRKSYFYPDSPRNFQITQFEHPIIRGG 
RIKAIVQGEERYFEIAG/THIEDDAGMLK^FGEFAGVDYNRAGVPLIEIVSKPCMFCPEDA 
VAYATSLVSLLDYIGI SDCNMEEGS IRFDVNVSVRPKGSPELRNKVEIKNMNSFAFMAQA 
LEAEKQRQ I DEYLNQPNKDPKLVI PAATYRWDPEKKKTVLMRLKESAEDYKYFPEPDLPT 
LOLTESYIERIRKTLPELPYDKYHRYIQEYGLSEDIASILISDKNIATFFEVACKDCKNF 
RSLSNWVTVEFGGRCKTLGVKLPSSGIFPEGVAQLVNAIDQGVITGKIA^ 
GKNPEEILKEKPELLPMSDEGELQKI I AEWLANPES IVDYKNGKTKALGFLVGQIMKRT 
AGKAPPKSVNELLLLELDKG 

CPn_0005 4097 6892 

pmp_1- Polymorphic Outer Membrane Protein

SDIHFDLGTKMRFSIXGFPLVFSFTIXSVFI?rSI^ATTISLTPEDSFHGDSQNAERSYNV 
QAGDVYSLTGDVSIS^NSAUJKACFT^SGSVTFAGNHHGLYFNNISSGTTKE<^^ 
CQj^P^ATARFSGFSTLSFIQSPGDIKEQGCLYSKNALMLLNNYVVRFEQNQSKTKGGAIS 
GAli&tri VGNYDS VSFYQNAATFGGA I HS SGPLQ I AVNQAE IRFAQNTAKNGSGGALYSDG 
~ I bEpQNAYVLFRENEALTTA I GKGGAVCCLPTSGS STPVP I VTFSDNKQLVFERNHS IM 
-&k YARKLS I SSGGPTLF INNI SYANSQNLGGAI AIDTGGEISLSAEKGTITFQGNRT 
SLEFiJ^GIHLLQNAKFLKLQARJO'SIEFYDPITSEA^ 
SGE^LANDPRDFKSTIPQNV^SAGYLVIKEGAEVWSKTTQSPG^^ 
KEDIAITGLAIDIDSLSSSSTAAVIKANTA^QISVTDSIELISPTGNAYEDIJIMRNSQT 
FPliiSiSLEPGAGGSVTVTAGDFLPVSPHYGFO^ 

KEGNILVPN I LWGNAVDVRS LMQVQET HAS S LQTDRGLWI DG IGNFFHVS AS EDN I RYRHN 
SGGYVLSVNNE I T PKHYTSMAF SQLF SRDKDY AVSNNEYRMYLGSYL YQ YTTSLGN I FRY 
AS RKPNVbJVG I LS RRF LQN PLM I FHF LC AYGHATNDMKT DY ANFPMVKNSWRNNCWAI EC 
GGS^LLWE^RLFO^AIPFMKl^LWAYQGDFKETTADGRRFSNGSLTSISVPLGIRF 
EKlAtSQDVLYDFSFSY I PD I FRKDPSC EAALVI SGDSWLVPAAHVSRHAFVGSGTGRYH 
FNDYTELLCRGS I ECRPHARNYNINCGSKFRF 


CPn_0006 7299 7141

No robust homolog present in GenBank/EMBL as of 11/7/98
KQl^PLRSALLERLSEWLVLLGVPSPETTRSTPEKDANQLPKDSRNRTLESL 

CPn_0007 7488 10496

No robust homolog present in GenBank/EMBL as of 11/7/98
KsiR-YNLSLIFSFLWIPLTDSTTSSLSTSLLDEGNPQSMRKIJlII^IVLIALSIILIA^ 
g\aLltvai PGLSSVISSPAGMGACALGCVMJ^IDVLUCKREVPIVLASVTTTP 
PRSGi S I SG ADST I RS LPTY LLDEGH PQSMRKLR I LAI VLI VF S 1 1 L I ASGWLLTVAI P 
G L S I S S P AGMG AC ALGCVMLALG I DVL LKK R EVP I VLASVTTT PGTGSPRSGISI^GA 
DST-4isLPTYPLDEGHPQSMRKLRILAIVLIVFSI ILIASGWLLTVAI PGLSSII/SPA 
EMGA£ ALGCVMLALG I DVLLKKREVP I WPAP I PEEW I DDI DEES I RLQQE. * " 

PEfifcSAF EG Y I KWES HLENMKSLPY DGHGLEEKTKHQ I RWRSSLKAMVPEFLDfrRRI F 
EEEEFF FLS ARKRLI DLATTLVERK I LTEQLERNNLRKAFSYLYQDS I FKK 1 1 DNF EKLA 
WKFMILSKS ICRFTI IFENHEHGVAXSLLHKNAVLLEKVIYRSI^KSYRDIGMJSSAKMKI 

lhgnpffslednkktimkehae>ileslssyrkvflalsdenvvdtpsdpkk™lsg I PCR 
dalseisrdeqwokkahlkhoeslytoardrltdqsskenqkele^aeqey/sswervkk 
fe I ERVQER I raiqklypn i lereeettgqetvtptvqgttassdltdi lqrievs sred 
.nqnqescwvlrshevemswevkqeygpkkkefqdqmgslerfftehieelevlqkdysk 
hlsyfkkvnnkkevqyakfrlkvlesdlegilaqtesaeslltqeelpi/atrgalekav 
fkgslccalaskakpyfeedprfqdsdtqlraltlrloeakasleeer/rfsnlendiae 
errllkeskotferaglgvlreiavestydlrsltnttoegtpesekv/fsmylnyyneek 
rraktrlvemtqryrdfkmaleamqfneeallqeelsiqapse 

CPn_0008 10780 11685

No robust homolog present in Genebank/EMBL as of 11/7/98
CKYSYLLWPPPPRRSL^VSCSKLRSLSITLLVLGVLLLTLGireLTAGISFCAGLGFSA 
LGGVLVISGLLFLLVRREVPTVRSEEIPRGVSVTPSEEPALEtfAQKEPETKKILDRLPKE 
LDQLDTYIQEVFACLERLKDPKYEDRGLLTEAKEKLRVFDVyEKDMMSEFLDIQRVLNEE 
AYYVEHCQDPLEN IAYEI FSSOELRDYYCAGVCGYLPSGDAfiADRLKRSVKEVMDRFMRV 
TWftSWEASVMLDHSYGVARELFKKAVGVLEESVYKILFKS/RDAFYECEKAKIQRDGRPK 
WL 

CPn_0009 11689 13 m

No robust homolog present in Genebank/EMBL as of 11/7/98
-f/r:;AHAF.01? FRD tNTICWEDLKOTl FWVCEHDCTD rafvRKSCMWtDRYADKF ILREKEEK- 

merhelfiiat^rkasciiayakakmfekercnencrkvkdvekwlskolaefrnqesrr 
arerlr^lotlypevsv[-:ervlerortkkvnlenkyadiekkyhiicvreoeh-/wkevenk 
kaeyremjekvlliaeevskclqrledclbtw^^ 

vtnrle r u:f.dal:em [ FR I EE I EMTLRMVELP w.fmkntfekaslqyn.sckemuxkvepq 

^rEGPT7RS:;uFIU..ERLNOni^rAYTNCOERlA:F;;DLE3KVRTCRDHLRE0MKI(FEV0G 
I J-JF rNRl-XE.WVi lARLPTQAlU.DLVAl'VPYMEf YLOYIITI [KREKVR;IOWMAKTERYREIRQ 
Al-'fjt IVMKFDt.t.AKDT I |jKF.F.iyiMVLl.RDDWLl£RDERKNRORRI. K'NK [AAAOOKVKGF 



CK YFY LRSYPPP PQHSVG5 If^^HRVLA [TFLVFGMLLL 1 :3CAL FLXLC I PGLSAA I S 
rcLGIGL^AUX^ISGLL^P^Eir-rVRPEEIPEGVSLAPSEEPALQAAOKTLAOL 
PKELDQLDTD IQEVFACLRKlTdSKYESRS FLNDAKKELRVFDFWEDTL3 E I FELRQ I V 
AQEC^LNFLINGGRSUefTAESESLDLFHVSKRLGYLPSGDV^CEGLKKSAKEIVARLM 
SLHCEIHKVAVAFDRNSYAMAEKAFAKALGALE^SVYRSLTQSYPDKFLESERAKIPWNG 
H rTWLRDDAKSCCAEKKLGMPRNVGRNLGKQSF^ 



CPn_0010 . 1



L426P 



1574/ 
FLKRIJlRKCALAiCrTFEKKF^KKNWAVHfEANARRLKW

YPEVSVSIRENKIQETRSNLEKAYEAIDENYRCCVHEQEDYWKEEEKREAEFRERG>^IL 
SPEEI^SLEQFDHGLKNFSEKl^ELE^HIUCLQKEATAEVENKILSDAESRLEIVFEDV 
KEMPCRIEEIEKTLRMAELPLI.PTKKAFEKACSQYNSCAEWLEKVKPYCKESLAYVTSKE 
RLVSLDEDLRi^YTErQKRFCGDSG^ESEVRACREQLRERIQEFETC^LDLVEXEUXVS 
SRLRNTECDCVSGVKKEAP PGKKFVAQYYDE I YRVRVQSRWMTMSERLRBGVQACNKMLK 
AGI^EEDKVIJCEEEYV^YREERK^EKRLVGTKIVATQQRIQEFQPSDIVE^ 
DKARFLFNREDHS 

CPn_0011 15377 16614

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit)
FWYS IMTAAPAILHVS PTP BEETKFV I PKDSKSRALG ITLLWG I LLWCGA IVLSGVI S 
GLS AL I VCGLG I ST I SLGWLFVLGL I LLLRKRELTLEQ I EAKQ I AETFADELKELEMY I 
OSTEKSLEKIEGSRYSDMFLNRATQKILDLESSLSS ITSEFRDLRQLFDEEKI ELLSGE 
RX.LEFIAANLFKQGRDV)OjNLGNIJ^IRAYMGP ekakawhefivlttmar 
ELEFFF 

CPn_0012 16596 18212

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit)

girvfflknkygwgwqe^rlleri^ynsvqksyadrlfsyektkmvhdtplipwee 

DKEKCAEAEKAFlSoOK ILLDYGKS I FVU^roEINU^FWSV^LmVRTRKVTQETv^DS 
ERW^WKVLIQI^DYEKLLEESSKESTEANKKLLSDLVDRLEDAKTKFFLKKQEEVETR 
VKDUlARYGGTJvDPKQDTEAKKKVELEASLETFLDS i eselvccledqdiywkeqdvkdl 
ARTQELEF^DaEAKREEAA£DLRSI^ERLKKSKTMLDP^AKWHIEKAEDS ITWrrSQIEMK 
DMKARUCILtolTSVLPEIDEIETCI^LEEI.PLLTTPXI^TKSYLKFKI 
WE2WIYv5eYEVQLQNLGFKLQG I SQRFGKKQDDF ANLEEQVALQKKRLRELTQNFEIQ 
GFTJFMKE^KAAAKDLYIRSTAEQFJ^DVrc 
AKKKLCsjLfd^EKELI^KEIKKEEFYQKKQQ^ 

CPn_0013 18509 21106

pmp_2-Polymorphic Outer Membrane Protein
UlD»LAFFIYLLYWKESPLR£KKVVMKIPrJ?FLLISLVPTLSMSNLI^ 
FTCTTSTTS FSSKTSS ATDGTNYVFKDSWI ENVPKTGETQSTSCFKNDAAAGDLNFLGG 
GF^FTFSNIDATTASGAAIGSEAANKTVTLSGFSALSFLKSPASTVT^L^INVKGNLS 
^NDKVLIQDNFSTGDGGAINCAGSLKIANNKSLSFIGNSSSTRGGAIHTKNLTLSSGG 
-TLFQGNTAPTAAGKGGAI AI ADSGT LS I SGDSGD 1 1 FEGNT IGATGTVSHS AI DLGTS A 
KITALRAAQGHTIYFYDPITVTGSTSVADALNINSPDTGDNKEYTGTIV^ 
/ KDEKNRTSKIiQWAFKNGTVVIJ<GDVVLSANGFSQDANSKL IMDLGTS LVANTES I ELT 
NLEINIDSU^KKIKLSAATAQKDIRIDRPVVLAISDESFYQNGFLNEDHSYDGILELD 
AGKD IV I S ADSRS I DAVQS PYGYQGKWT I NWSTDDKKATVSWAKQS FNPTAEQEAPLVPN 
LWGS F I DVRSFQNF I ELGTEG AP YEKRFWVAG I SNVLH RSG RENO. RKFRHVSGGAWGA 
tRMPGGDTLSLGFAQLFAP^KDYFMNTNFAXTYAGSLRLQHDASLYSWSILLGEGGLR 
EIIXPYVSKTLPCSFYGQLSYGHTDHRMKTESLPPPPPTLSTDHTSWGGYWAGELGTRV 
JP/ENTSGRGFFQEYTPFVKVOAVYARQDS FVELGAI SRDFSDSHLYNLA I PLG I KLEKRF 
/AEQYYHWAMYS PDVC RSNPKCTTTLLSNQGSWKTKGSNLARQAG I VQASGFRS LGAAAE 
LFGNFGFEWRGSSRSYNVDAGSKIKF 

CPn_0014 21365 21922 

pmp_3-Polymorphic Outer Membrane Protein

I QNQS I YFTMKS S F PK FVF ST FA I F PLSM I ATETVLDS S AS F DGNKNGN F SVRESQEDAG 
TTYLFKGNVTLEN I PGTGTA ITKSC FNNT KGDLT FTGNGNSLLFQTVDAGTVAGAAVNS S 
WDKSTTFIGFSSLSF IAS PGSS ITTGKGAVSCSTGSLSLTKMSVCSSAKTFQRIMAVLS 
PQKLFH 

CPn_0015 21335 24174 

pmp_3-PMP_3 (frame-shift with 0014) 

LEFDKNVSLLFSKNFSTDNGGAITAKTLSLTGTTMSALFSENTSSKKGGAIQTSDALTIT 
GNQGEVSFS DNTSSDSGAA I FTEASVT I SNNAKVSF I DNKVTGASS STTGDMSGGAICAY 
KTSTDTKVTLTGNQMLLFSNNTSTTAGGAIYVKKLELASGGLTLFSRNSVNGGTAPKGGA 
IAIEDSGELSLSADSGDIVFLGhrrVTSTTPGTNRSSIDLGTSAKMTALRSAAGRAIYFYD 
PITTGSSTTVTDVLKVNETPADSALOYTGNI I FTGEKLSETEAADSKNLTSKLLQPVTLS 
GGTLSLKHGVTLOTOAFTOOADS RLEMDVGTTLEP ADTST I NNLV t N I S S I DGAKKAK I E 
TKATSKNLTLSGTITLLDPTGTFYENHSLRNPQSYDILELKASGTVTSTAVTPDPIMGEK 
FH YGYOGTWG P I VWGTGASTT ATFNWTKTGY I PNPER ICS LV PNSLWNAF I D I SSLHYLM 
ETANEGLQGDRAFWCAGLSNFFHKDSTKTRRGFRHLSGGYVIGGNLHTCSDK I LSAAFCQ 
LFGRDRDYFVAKNOGTVYGGTLTi'QHNETY I SLPCKLRPCSLSYVPTE I PVLFSGNLSYT 
HTDNDLKTKYTTYPTVKGSWNDSFALEFGGRAPICLDESALFECYMPFMKLOFVYAHQE 
GFKECGT EAR EFGSS RLVN L.^L P IG I RFDKES DCQDATYNLTLGYTVDLVRSNPDCTTTL 
RISGDSWKTFGTNLARQALVLRAGNHFCFNSNFEAFSQFSFELRGSSRNYNVDLGAKYQF 

CPn_0016 24383 26188 

pmp_4-Polymorphic Outer Membrane Protein

RS0FALKRGCHMRSSFSLLLIS:i3LAFPLLMSVSADAADLTLGSRDSYNGDTSTTEFTPK 
AATS DASGTTY I LDGDVS I SO AGKQT5 LTTSC FSNTAGNLT F LGNG FS LH FONT IS STVA 
nWVSNTAASCITKFSCFSTLRMLAAPRTTGKGAIKITD(XVFE;UGNLDLNENASSENG 
GA INTKTLSLTGSTRFVAFLoNSSSOOGGA I YASGD2V I SENAG I LSFCNNS ATTSGGA I 
GAEGNLVISNNON I FFDGCK.-%TTT JGGA r DCNKAGANPDP I LT LGGN ES LH FLNNTAGNSG 
^A I YTKKLVLSSGRCGVLF^NNKAANATP KGGA I A I LDSGE I Z ISADLCjN 1 1 FEGNTTST 
TGn PACVTRNA I DLASNAK FLNLRATR0NK7 1 FY DP ITSSCATDKL.SLNKADAGGGNTYE 
fJYrVFnc;EKLSEEELKKPDNLK:JTPrOA7KL.Av\C]ALVLKD(;VTV\VWrrTOVECiSKVVMD 
r0«;CTFF.A:^ AW'VTLNK II AT Nt DCLDCTI^ 

nL:;o*.ovi-Ti. [ ['.uiw tvmtvtl l i 'dti* : i .rirTNi Wt ;y<j< inwjnci .c,ui'.< -ni:knkkcyln 



z<A>n::t 



L 1 i.M 



rUink/EMDL . 
V.'XSDSW.TrCGT.ILSRQALLVRACNHHAFASNFE'/FrfOF S3 R5Y A I DLGGRFGF

CPn_0018 275 1 i 2VO0 3

pmp_5-Polymorphic Outer Membrane Protein

EVNMKTSVSMLU\LLCSCJA3S I VLHAATT PLNPEDGF IGECNTNTFS PKSTTDAAGTTYS 
LTGEVLY I DP0KGG3 rTGTCFVETACDLTFLCNGNTLKFLSVDAGANIAVAHVQGSKNLS 
FTDFLS LV ITE5 PKS AVTTGKGSLV5 LCAVQLQD INTLVLTSNASVEDGGVI KCNSCLIQ 
r-rKHSAIFYWrSSKK<73Ar:7rW 

■ ■ >'>■■-> -nKT"' ;r J- *.m. iiAii !•■.:« ^[/vrri.rrrri.i.LOKriiTMr^yiAL;::;^::;:^:^ 
'■■'iim', vn-r/'ri ;:.\t. ;aa::i y :vv'.m .f: :: irr/vri iatpl* v ;a : -urrrr.r-u:i.F?Q 

f"JGD I VF EGNQVTTTA PNATTKRNV I H LESTAKWTGLAASQGNA I Y FYD P I TTNDTGASDN 
LR INEVSANQKLSGS IVFSGERLSTAEAI AENLTSRI NQPVTLVEGSLVLKQGVTLITQG 
FSQEPESTLLLOLGTSL . 

CPn_0019 29007 30356 

pmp_5-PMP_5 (frame-shift with 0018)

ASTEDIVITNI^INADTIYGKNPINIVASAANKNITLTGTIJUiVKADGAFYENHTLQDSQ 

DYSFVKLSPGACOTIITQDASQKPLEVAPSRPHYGYCGHWNVQVIPGTGTQPSQANUW 

RTGYLPNPERCGSLVPNSLWGSFVDQRAIQEIMVNSS0ILCQERGVW3AGIANFLHRDKI 

NEHGYRHSGVGYLVGVGTHAFSDATINAAFCQLFSRDKDYWSKNHGTSYSGVVFLEDTL 

EFRSPGGFYTDSSSEACCNQVVTIIM2LSYSHRN^MKTKYTTYPEAQ^SW^ 

GATTYYYPNSTFLFDYYSPFLRLG£TTAHQEDFKETGGEVRH 

RFSIXKRGSYELTIJ\YWDVIRKDPKSTATLASGATWSTHGNNLSRQGLQIJU-GNHCLIN 
PG I EVFSHGAIELRGSSRNYN INLGGKYRF 




CPn_0020 

Predicted OMP (leader U4) peptide: outer membrane) 
KLWSNPNLRLMKRCFL FLASFVLMGS SADALT HQEAVKKKNSYLSHFKSVSG IVT I EDGV 
UNIKNNUnQANKVWEI/r/GQSLKLVAHGNVMVN^ 

RF AMYPWFLGGSMI TLT PET IVI RKGYI STS EGPKKDLCLSGDYLEYSSDSLLS IGKTTL 
RVCR I P ILFLPPFS IMPME I PKPPINFRGGTGGFLGSYLGMSYS PI SRKHFSSTFFLDSF 
FKHGVGMGFNLHCSQKQVPENVFTJMKSYYAHRXAIDMAEAHbRY^ 
GEYHLSDSWETVADI FPNNFMLKNTG PTRVDCTWNDNYFEGYLTSSVKVNSFQNANQELP 
YLTLRGY P I S I YNTGVYLENI VECGYLNFAFS DH IVGENF S SLRLAARPKLHKTVPLPIG 
TLSSTLGSSLIYYSDVPEISSRHSQLSAKLQLDYRFLLHKSYIQRRHIIEPFVTFITETR 
PLAKNEDHY I FS IQDAFHS LNLLKAG I DTSVLSKTNPRFPRIHAKLWTTH ILSNTESKPT 
FPKTAC ELS L P FG KKNTVS LDAEW I WKKHCWDHMN I RWEW IGNDNVAMT LES LHRS KYSL 
IKCDRENF I LDVSRPIDQLLDS PLSDHRNL I LGKLFVRPHPCWNYRLSLRYGWHRQDTPN 
YLEYQM ILGTKI FEHWQLYGVYERREADSRFFFFLKLDKPKKPPF 

CPn_0021 34470 32707

Predicted OMP [leader (19) peptide] 

CSftSPY PNIEILARGVEHRSMGLFHLTLFGLLLCSLPISLVAKFPESVGHKILYISTQST 
TO^>TYLEALDAYGDHDFFVLRKIGEDYLKQSIHSSDPOTRKSTIIGAGLAGSSEALDV 
LSOAMETADPLCOLLVI^AVSGHI^KTSDDIXFKAI 

DHOiSF I HKLPEE IQC LSAA IFLRLETEESDAYI RDLXAAKKSAI RSATALQ IGEYQQKR 
FL^LRNLLTSAS PQDQEA I LYALGKLKDGQSYYNI KKQLQKPDVDVTLAAAQALI ALGK 
EF$ft£PVIKKQAI£ERPRALYALRHLPSEIGIPIALPIFIJCT^ 
ITT PULLEY ITEJUjVQPHYNETLALSFSKGRTI^N^ 
EQiLLTFLFPXPKEAYLPCIYKLI^QKTQL^^ 

Il'RAYADIAIYMjTKDPEKKRSLHDYAKKLIQETI^FVDTENQR^HPSMPYLRYQVyPES 
RTj^^DILETLATSKSSEDIRLLIQIifrEGDAKNFPVLAGLLIKIVE 

CPn_0022 35042 34395

mat 

TILQVI SNCCNVSNTRSFYSMSLPLVLGSSSPRRKF ILEKFRVPFTVI PSNFDESK 
GDP^AYTQELAAQKAYAVSELHSPCDCIILTGDTIVSYDGRIFTKPQDKADAIQMLK 1 
NQTH&WTS I AVLHKGKLLTGS ETSQ ISLTM IPDHRI ES YI DTVGTLNNCGAYDVCHd 



32717 



30603 



CPn_0023 36657 35014

yjjjK/alr-ABC Transporter Protein ATPase 
FJ^RA^LLYSKQHFVMLSAMSIVLDKIGKSLGTRILFDDVSWFNPGNCYGLTGPNfeAGKS 
TLLKI IMGM I EPTRGS I SLPKKVG I L RQNIDS FHDTTVLDCVIMGNTRLWEALOBlRDNLY 
LQEgDAIGMELGEIEEIIGEF^YRJ^SEAEEU,TGIGIPNEMFDKKMAMIPI(DLQFRV 
LLCQALFGHPEALLLI)EPTNHLDLYSINWLGNFI^^ 

IDXBTI I IYPGNYDDMVEMKTASREQEKADIKSKEKKI SQLKEFVAKFGAGSBCASQVQSR 

lrelkklqpqelkksnxqrpyirfplsdk$sgicwlsleaitkdygdhqvrhpfsleryq 
gdki^iignnglgkttlmkllagveapssgsiklghqaicsyfpq^sdvijCdc^ 

EWLRNRKTGINI^EIRSVIjGKMLFGGDDAFKQIQALSGGETARIXMAGMMu^ 
EANNHLDLESVSALSWAINDYKGTA I FVS HDRGLIQDCATKLL I FDKDK ^TFFDGTMVDY 
TAGHKQLL 

CPn_0024 37605 36661 

xerC-Integrase/recombinase 
REVM IAS I YS FLDYLKMVKSAS PHTLRNYCLDLNGLK IFLEERGNLA/PSSPLQLATEKRK 
VS ELPF S LFTKEHVRMY I AKL I ENGKAKRT I KRCLSS I KS FAHYCV/OK I LLENPAET I H 
GPRLPKELPSPMTYAOVEVLMATPDISKYHCLRDRCLMELFYSSGBRISEIVAVNKQDFD 
LSTHLIRIRGKGKKERI IPVTSNAIQWIQIYLNHPDRKRLEKDPoAlFLNRFGRRISTRS 
I DRSFQEYLRRSGLSGH IT PHT I RHT I ATHWLESGMDLKT IQAl/ghSSLETTTVYTQVS 
VKLKKOTHQEAHPHA 

CPn_0025 38610 37684 

elaC/atsA-Sulphohydrolase/Glycosulfatase 
I LMSSRELI r LGCSSQQPTRTRNQGAYLFRWNGEGLLFDPGEGTQRQF I FAN I APTTVNR 
rFVSHFHGDHCLGLGSMLHRLNLDKVSHPIHCYYPASGKKYFDRLRYGTIYHETIOWEH 
PISEECIVEDFCSFRIEAQRL^HQVDTIjGWRITEPDTIKBLPKELESRCIRGLIIQDLIR 
DQEISI GG3TVY LS DVS YV RKGDS I A 1 1 ADT L PCOAAI DLAKNSCMMLC ESTY LEQHRH L 
AELjMFHMTAKOAATL^KRAATQKLILTHFSARYL^LDDyYKEAGAVFPNIVSVAOEYRSYP 
FPKNPLLMK 



t:Pi\JH)2*] ■ V.»ri37 ]R7ft2 

CTH'j hypnrlit ; r. ir;j L protein 
KHKi'vr.i.KKTVi'[,i'rEi.«.F.Mo:;:K"!NES:;o:;:i: :7P[vgrkailpwr'koptdp



PS I RT I VDSTTNSDS P I LDI^VK LLDE3 EEE3 EDQSTERLLPS EL F I L PLNKR PF F 
PGMAAP I L I ESG PYYEVLK'I^^Rk V I GLVLTKK ENAD I LKVS FNQLHKTCVAAR I LR 
IMP I EGGS AOVLLS I EER I iWl K DKYLK ARVSYH ADNKELTEELKA YS I S I VSVIKD 
LLKLNPLFKEEWIFLGHSD^EPGKLADF5VALTTATREEW 

LL^Dl^RL^SINQKIEATITKSOKEFFLKEOLKTlKKELGLEKEDRAIDIEatFSER 
LRJWWPDYAMEVIQDEJE^ 

ivlnkdhygldeikcW^ 

FRFSVGGMR DEAE IKGHR^PTY ^^^^^^9^^^^^ T^p«?^p^- 

" '•a^ky 1 vpk '%??«/•;: :Va. :kv: ii- : \ • ka : ym ! ? invar eagvpti .r;- in r kkylrkv a 

LK IVONUEKPKSKKIt/k I JJKN UjTYLJKP I F J JURFYESTPVGVATGLAWTSLGGATL 
Y I ESVQVSSLKTDMHBfrGQAGEVMKESSC IAWTYLHSALHRYAPGYTFFPKSQVHIHIPE 
GATPKDGPSAGITMwSLLSLLLETPWNNLGMTGEITLTGRVLGVGGIREKLIAARRSR 
LNILIF PEDNRRDY£ELPAYLKTGLK I HFVSHYDDVLKVAF PKLK 

CPn_0028 43328 42543

No robust homolog present in Genebank/EMBL as of 11/7/98
RMF LQ FFH P I Vy S DQS LS FLP YLGKS SG 1 1 EKCSN I VEHYLH LGG DT SV 1 1 TGVSG ATFL 
SVDHALP ISKS£K 1 1 K I LSY ILILPL I LALF I KI VLR 1 1 LFFKYRG L I LDVKKEDLXKTL 
TPDQENI^LpSsPTTLKKIHAIJilLVRSGKTYNELrQEGFSFTKITDLGOAPSPKQDIG 
FSYNSLLPN^FHSLVSVPNISGEERAL^HKEC^EE^VKLKTMQACSFVFRSLHLPSM 
OTKDKKAGFpLLTFFPWKIYPL 

CPn_0029 43839 43390

No robust homolog present in Genebank/EMBL as of 11/7/98

_Hl YCFNLFRYI RFFAALNIRMNDGLRFCYSY I LLRPMLLDSSLLRKGGQELL 
KKFOI^RTTSIKSSLISLRQQLGKREATQSDILYGTSRFQYLNSFEIEDPRIPPTMAAQ 
LQ E I TJvSRSVMELK I KF YVYLNS ERNKTKP 

CPn_0030 43840 44529

gcp/O-Sialoglycoprotein Endopeptidase
U<(WCWSLFFYIKNRJWYFYKYVIII7rSGYYP 
FLFKSKNLSFOGVAVAI^PGNFSATRIGISFACGLAMAKNVPLJJ^ 

LPLGKRGGVLTLSSEI PEEGLNEKRRGVG PGALLSYEEASDYCVAHGYYHVI S PNPQ 
FASSFSDKITVEEVAPSVEQIRJIHVISQFMFVEYDKQLS PDYRSYSC I F 

CPn_0031 44708 44884

rs21-S21 Ribosomal Protein
CMPSVKVRVGEPVDRALRI LKKKIDKEG I LKAAKSHRFYDKPSVKKRAKSKAAAKYRSR 

CPn_0032 44881 46098 

dnaJ-Heat Shock Protein J 

SLIGNWFVGSVSGMDYYS ILG ISKTASAEEIKKAYRKLAVKYHPDKNPGDAAAEKRFKE 
VS EAYEVLS DPQKRDS YDRFGKDGP FAGAGGFGGAGGMGNMEDALRTFMGAFGGEFGGGS 
FFDGLFGGWEAFGMRSDPAGAKCGASKKVHINLTFEEAAHGVEKELWSGYKSCETCSG 
CGAVNPO/3IKSCERCKGSGQWOSRGFFSMASTCPECGGEGRIITDPCSSCRGQGRVKDK 
RSVHVH I PAGVDSGMRXJQ4EGYGDAGQNGAPSGDLYVFI DVESHPVFERRGDDL I LELP I 
GFVDAALGMKKEIPTLLKTEGSCR1,TVPEGIQSGTILKVR^CX3FP^ 
VETPQNLSEEQKELLRTFASTEKAENFPKKRSFLDKI KGFFSDFTV 

CPn_0033 46129 48171 

pdhA&B/odbA&odbB- (pyruvate) Oxoisovalerate Dehydrogenase Alpha 
& Beta Fusion 

ERSMGWQNQVISSIRDVLKLVWELRFA£HKMI1X>SR0SGSGGTF0L^CAGHELAGVLAG 
KSLIPGKDWSFPYYRDQGFPIGtiXIDLSEIFASFLJ^TTP^SSAWWPYHYSHKKLBIC 
CQ S SWGTQ FLQ AAGRAWAVKH SS ADEVVYVSGGDG ATSQGE FHEMLNFVALHQLPL ITV 
I0NNWAISVPFEDQCGADLASLGRCHO/3LAVYEVIXMNYTSLTETFSHAVDQARQH 
ALILIDVVRLSSHSNSDNQEKYRSALDIJCLSMDKDPLILLEKEAINVFGLSPFEIEEIKA 
EAQEEVRJCSCEIAEALPFPSKGSTSHEVFSPYTETLIDYENSESAQNLRNSEPKVMRDAI 
SEALVE EMTRDSGVI VFG EDVAGDKGGVFGVTRNLT EKFG PQRCFNS P LA EAT I IGTAIG 
MALDGIHKPVVEIQFADYIWPGINQLFSEASSIYYRSAGEWEVPLVIRAPSGGYIQGGPY 
HSQS I EGFLAHC PG I KVAYPSNAADAKALLKAAI RDPNPWFLEHKALYQRR I FSACPVF 
SHDYVLPFGKAAIVHPGKDLTIVSWMPLVLSLEVAQELASRGISIEVIDLRTMVPCDFA 
TVLKSLEKTGRLLVIHEASEFCGFGSELVATMSEQGYAYLDAPIRRLGGLHAPVPYSKVL 
ENEVLPHKES I LQAAKSLAEF 

CPn_0034 49496 48210 

CT345 hypothetical protein
VNFLLPTTCRG I LMAE ISTPSLPDSS I VSQKTP PVPDPDS SPDH I PT I PTQAPFKPQRKK 
ETPSSIVNAIAFAILAFLSCLGGVFAICLGCSLEITMPLFILTAVFIAFTLLYFIHYLEK 
PKI PEPLPTPPPSPTLRAPTLTPEI PAPAPGI PLPPTLPKVDRTKLTCNPDIHYPSTYDP 
KACFSLLKQLFSLDPETRPEDRKYSNKLASILLRSKEKSGFRFHCFKGHFSHDKILNKKS 
GAWISSHSSMDFSTTLGRAFAVTTCLQRSCWEKIKNNIPTPEKHLPIGSCVSGPWDVEE 
GAOLYTSHLIVINPPTLETLIKEKMRRAITLKDFSMKEAFTNLVLAYLQCFDICIEHNLE 
SVQLEVFGLNNLSADQEEFTTWESCCHLALLESVR I LLASKE EYALSNVSVNS ISQVPLQ 
TACRALFLN 

CPn_0035 51146 49569

CT33 , J hypothetical protein 

ARTTLEEDAGSSLKPLPKTFPCATALYITHRRERKSEHQMWNRCOVFSSFFFRYPISSWL 
I RLRASCECFQQRHP I FLCGLYWLAG ITSRGH PECSALI LI FLGMFLPRNPKOWLPLASA 
WI I SLMLTPAPFLHDGP I SGTFVIHHAGGQGTYYGEALC IQTPCGKRAHHLSCQ I LSESR 
LELKKVfELEGTLHHTSQIVFKSNACYKEIPRSRFYIMKEKCRESSCHFLNHRFPSSEVG 
PFASSLLLCTPLPQNLRDLFROKGLSHLFAISGWHFSLCATTLWMLCALLPLKIKKILSF 
IVLTSLACIFPMSLSVWRSWISVTLLCFSWCFCGSCSGUmLGAGFILCSrFFSPFSPTF 
VLSFLATLGILLFFPKIFSFLYTPWTOFLSPFWLYPIRYLAMTLAISLSAOLFIVLPIMQ 
YFGSLPLEGLLYNL IVPFTTLP I IVFL I ATI I LPCCSPITEALIQGFLSHPWLHNPNILK 
TLS FAPVPPWMLT IJ^iZ LI LFFIG I LRTNVS PYASI SATSYRFI ETL 
7 HRtL0Kr'JDAF33CFGEL

f.'F'itJIOJH 'iJLt'J 'iJBll 

CT3~H hypothetical protein
MDTQ3STCNEEWR I ACT.*; I V3GMALCKVFFLGT3PLHVRELTLPQEEVEHEIHRYYKALN 

MGK r EE3LTAVRGMP3WDRVQD r HD I3NRVICHLCCCHKSSLCE3DQNL 1 I FSEELTPS 

fv as an:: a y r RGFvr: l vgaat:; hta r vsr aks r pylan i seelwn i akrync klvl idgy 

•■■;:-:m ■;*■[ at: :'/7Ai it: v*-.-*: I v::.:ha« ' l K i//ep t . L f* I f rr™ 
.:; ::.i , i-::Ki , [.Avn; '■■m-:-:^*/!^ .vRKi.Ai'r;- ;:\-r: :vr .[-: \A-\A- , 'z\: i y\i:riy t vwr:z 

R3 r RWLLDY3VI LEDQLQA IAKA3LQG3I KVL I PGVSDVS EI I EVKKKWET I RTRFPKGH 
KV3WCTM I EFPSAVWM IEE ILP ECDFLS IGTNDLVQYTLGISRESALPKHLNVTLPPAVI 
RM I HHVLQAAKQNQVPVS I CGEAAGQL3LTPLF IGLGVQELS VAMPV INRLRNH I ALLEL 
NCCLEITEALLQAKTCSEVEELLNRNNKITS 

CPn_0039 54256 53963 

CT339 hypothetical protein

IS^SGYAKKKKEAKIMEQQFLEMEASIXEKRYEGQAGNGLVSWINGKCDLISVKVQPT 
CLDPEDPEVI EDLFRAAFKLAKEQMDQEMSLMRSTMPF 

CPn_0040 55673 54318 

dnaX-DNA Pol III Gamma and Tau 

AFYTHSLGYTOTL^PYOAoSRKYRPQIFREILGQSSWAVLKNALVFNRAAHAYLFSGIR 
^TGKTTIJ^ILAKAL^CVHLSEDGEPCNGCFSCKEIASGSSI^^ 
INETVLFTPVKAKFKIYIIDEVHMLTKEAFNAIXKTLEEPPQHVKFFFATTEIH^ 
LSRCQKMHLQRI PEKT I LEKLS LMAQDDH I EASQEALAP I AJUAQGSLRDAESLYDYVI S 
LF PKSLSPDTVAQALGFASQDSLRTLDNAI LQRDYATALG IVTDFLNSGVAPVTFLHDLT 
LFYRNLLLTNSTTSKFSSQYKTEQLLEI I DFLGES AKHLQNT I FEOTFLETVI IH I IRIY 
QRPVLSELISSIKSROFEGLRNIKEPTLTC^VSAPQPQPTTKEQSFLEKKNQPAAEGKII 
SVEVKS S AS I KSAAVDTLLQ F AWEFSG I LRQ 

CPn_0041 55888 57342 

No robust homolog present in Genebank/EMBL as of 11/7/98 

CKYLYHHSYPPPQHSVGSISSRYKLRVLAITFLVI^^ 

GLGIGLSALGGVLWSGLLCIiVKREVSKVCPEEIPAVQPEETPEX^^ 

QKEQKTQK I LDQLPQELDQLDRY IQEAFACLG PLKDLKYEEQGFLQDVKEEFQVFDFVQK 

DM I AEFVELQQI LCQEGRLLEFVINQTRY IGRDLFKREDSLYKLWEWLGYLPSGDVRGER 

LKKSAREVVDRFMRTTCNIRKIAMTFDRHVYSVAXTAFEKAFGALETCVYE^ 

FCEYEKAKLI^DEEXSAHAEORFQDIKNRWEDVKDAFFWVKEDGKIEIDDAIGNSCKWSE 

RYEEHRITRAJWYKVAEHQLFNATM1WKDSU*FJ<NEA^ 

LRT5^ELHDQELPRA0ERLRELQALYPEIAVSVVEARREVA5DLEKAHESIDKHYQSCVR 
EdgiY 

B.jJ 

CPn_0042 57346 58182

No robust homolog present in Genebank/EMBL as of 11/7/98
EEEEKQEAEFRENGTK I RSMEEVSEYLC^VENQLESCSKRLTKMETFALGVRLEAKEEIE 
SiitiSDWNRFEVLCRDIEDMLSRVEEIERMLRMAELPLLPIK 

Tre^PYFKESPAYLTSEERI^SLNCTLQRAYKESQKVSGLESEVT^CREQLKDQVRQFET 
(^^LIKEEILFVTSTFRTKFSYHSFRIJWPCMRLYEEYYDDIDLERTRJ^WMAMSERYR 
DAT^AFQEMIJCEGLVEEAOAIJ^EYWLYREERXSKKKH 

CPn_0043 58432 60372

No robust homolog present in Genebank/EMBL as of 11/7/98

hh^jmqvplspqlpppppdhsvgasfclskfrviaitflvlgvlllisgalfltlgisg 

vsl^glglsaixjsvlvisgfllllerrevsgvglegiptcipvgpsaepsseeiqkkqk 

akq i ldqlpqeldqldtd iqhvlscwklkdlkckdrgli^akeki£vfdfvwkdmmme 

fvelqqvmdqesrylegli hevqs i ahklfvddvnirshlgescgylps edvrgellkrf 

akevvarfmkwrdirkiamaf^knaygaaknafdkafgsletclyksltksyrj^ 

kr^ilpdennsaraeqrfrevkdhwedlketvfw\^^ 

lilexrkdkvmshqlwfatmrvkeaevtysvaiwafekdgsg^^ 

dlri^echraqerlekltalypevsvswcteresxfnlekaygnleeryqsvvqdqedy 

wteqknreaefrakgtkvrsmeevaehlqi^ 

EYlILSDAANRLKVLCEDI edtlprveei emmlrmaerplhpikqaftkafvqynrcker 

LAft^P YYKES PAYVNSEERLQS LDQASQC IQR VPKGFKFRNGSMY I 

CPn_0044 60278 60778

No robust homolog present in Genebank/EMBL as of 11/7/98
IAKsbCRWIRLHSAYKESQKVSSLETEACTYREYLREQVQQFETOGVSLIKEELLFLSS > 
TLKSKLSYDPL I AN I PCMKFYYQ YYDD I DKARAQSRWLEKSERYRNAKRRFOEIVKKGLF/ 
KEAKPLKKEEYRLLQEERSNKEKRLIYNKMAVARQRVQEFESMEIPE 

CPn_0045 60961 62790 

CT345 hypothetical protein 

CKYTYHPPQLPPDHSVGATSWQPKLR I LT ITFLVLGVLLL I SGALFLTLGVPGLAAGLS F 
GUJICLSAU&VLWSGLLFFLIRRGVSKVRPEEIPVTPSHEAQKILCQLPQ^ 
IQEWSCLGKLKDLKYEDQCLLTEVQ^KLRVFDFVRKDMVTEFLELCQWAQEX^^ 
INQVQS I SHKLFVPDVN IGAHL^ELCGYLPSGDVRVERLKRSARQ\AT3RF^VTG'DTRKV 
AMAFDENACGVAKNAFDKAFGALEECVYKSLTESYREAFYEYEKAK I LRNEDVEWLQDKN 

ksaraeqrfrevkdrwedlketvfwvkengcidl^ltavggwpdrgpehlipZkrrnkv 

MSHKLWEATMRMKGAEGTYSVARVAFEKDGSRKNQKKFQEKTKEWLRCLKDLbfDOECHRA 
RERLAELEALYPEVSVSWETERETKFKLETAYGNLEERYQSWRdQEDYWKEEENKEAE 
FREKGT KVRSPEEWEYLOILENLSEDCSKOLT I AE*/-/VLGVELEATAE FEYT I LS DAAN 
RLKVLCEDIEDILPRVEEIEIMLRIAELPFLPIKQAFTKAFLQYNSCKDKLAKVEPYCQE 
SVDYK3GFRV 

CPn_0046 62775 63263

No robust homolog present in Genebank/EMBL as of 11/7/98
ERFQ.1LN0DL0NVYQECQKATGLE3 EVSAYRDHLREOjITEFETCCLG'v IKEELLFVSSTL 
K3KL3YDPLIADIPCMKFYEEYYDGIDKARVQSRWLEK3ERYRKAKKGFQEMLKEGLFKE 
n0ALKKAEYRLLREKRMNKFKLLECNKIEAA00RV0EFGP3D3 

. A"lTi-!>04V _ ._/, o.lb'iS - 

No robust homolog present in Genebank/EMBL as of 11/7/98
KHRILKVT rWFUFVl.GC; [ LTM\'HPOK [RMTL'TTr/'JF'/mK/LRKDYELWFWr.^crEC 
KVKI/JT.'IIJMKWI. 

t.'lTi_')'14H ',ii-s v 7 nSROt 

'V'fll' Us ..rjiiJUM VK-(i hypothttt ioil IM r^rorVin 

MKFJ^HK:JYNRAIJIK[.:;ilOWVKY[-lATFVriCSK[VA{f^-AWr,KVLYVPEYKAGEISRIS 

i.TArw)F::L::w:;AiiKFYKhTAiiir:^^ 

!''i.L,:rrrjFVu:;:;T'jK( "LKDLr iyppllgkekktlf. tri i>ir;NKf jnivia^vfcmlk iflioen 



CPQPCFDA [MDILK [ Af IFE^^^JSCCVKCELLCKRC I EK CTKCTP I LEK YQR I DDRD 
AK I LKQLRAQLL3VHTLF3<^^pA I FVVLL I LLWGYCALK ALCPEMLKSPQR FMLY I A 
tLTL3LLWCPGTEIFCAW.^H^PPrLPFTAVL^YFL>GLPrACF3CTFIJVLLYTLGS 
DLWNNSWFLS I NLLC3WR I L VS LH R VS R LS SV FWAC MKLCGV AMCS LLM F R r FTNTISRE 
ALYAIXIIESFVYSLITAI3WALIPVFEA3FGASWFSLLTYLSPENALLKRLFKEAPGT 
YOHSVLVCSLAEAAAOAIGADSLYCLVAAHYHDJGKLINPGFFSENQKILOQSGHSLSPL 
ECAKM I MRH I P EGVNLARQ ACL PES F I CV I EEHHCTSVI RSAYYSHMVENPSTGSFDEEL 
FRYSGNKPSSKETT I IMIADSFEAASRSLKNAfiLPDLORLIDQI IQGKLQDGQFSCSPIT 
CPn_0049

No robust homolog present in Genebank/EMBL as of 11/7/98
LKEKRRNrVYLLVI YQEIFWLTMLHQPYYDKI LTGNT I Y I PGHTHKDSNKLFQKKSRAIW 
VDEKPFSLDCFSNVFLIFVSLVP I AGLyRAYQ IKKSLDRTTVQIGYSPSLSCEQKECVEA 
FVNGYGLIC IS ILGGLGILVPI LILW^SLLLLG ILMLFSLSTYES IKNYISKHICWKSN 
AT 

CPn_0050 66849 66499

No robust homolog present in Genebank/EMBL as of 11/7/98 
VSWFPILGI FXAMRYAKHG/IW^ENVK^ 
QGCGILLPIFUXI^ILLISVLroLIMLPFRIXCFAI^ 

CPn_0051 667(T7 67111 

No robust homolog present in Genebank/EMBL as of 11/7/98
CFAYLIAJWIPRMGNHETYffiPGVLPSSHAODVSRSTVYPSRSFIMRJWLMGWNFNRVPS 
KSSEOLMDGHRI PL I FFGIMHPT IS I LNVNRFSWLS I FYNGERG F 

CPn_0052 68008 67304

hemC-Porphobilinogen Deaminase
KML^VTCSDPCLSDFCMKRPLRIASRNSNIAKAQW 

GDREKKIPLHLVENSyFFTDGVDALVHKGVCDLAIHSAKDLPETPSLPWAITRCLHPAD 
LLWADHYVHEPLPL5PRIX3SSSUIRSAVU<0LFPCX%II^IRGTIEERLIX2LHRGH^ 
IVIAKAASLRLHLHpAYS I ELPPPYHALOGSLAITAKDHAGKWKQLFTP IHCHSS 

CPn_0053 69350 67986
sms-Sms Protein

I RMATKTKTC^CNCCGAT APKV^GCC PGCHNWNSL VEEWP^ SSRSSTSA IALS 
SIELENESRI^DHAGWDRILGGGVVRGSLTLLGGDPGIGKSTUXOT 
WCGEESVTOTSU^AKRiJilSSPLIYIJPETNLDNIKC^IATLEPDILIIDSIQIIFNPT 
LNSAPGSVADVREVTYE1MQIAKSAQITTFIIGHVTKSGEIAGPRVLEHLW 
SHANYRMIRSVKNRTOPTNEU,ILSMHADX3L^ 

SGAIXIE^ALVSSSPFANPVRKTAGFDPNRFSUXAVLEKRAOVKLF*^^ 
KI IEPAACLGALLAVASSLYNRJXPNNS IVIGEVGLGGE IRHVAHLERRIKEGKLMGFEG 
AILPEGfl ISSLPKEIRENFRLQGVKT I KDAI RLLL 

CPn_0054 70089 69313

rnc-Ribonuclease III

tls/fppikipnskfkdgaixsmhppiditaieaklnftftqpkixeialthpsyknesa 
ds erleflgdavlgl ivtehlfllfpsmdegtlstaras lvnakacc r yttmlg ig 
ljgkgekiqsergrlsayanlfesiix^vyxtjgglsparkltvpllppreeilplms 
qftqkqfrvlpvyqstavtdaqgnvsyo iqvlvnoevwgegnas skkeaek i 



CPn_0055 70096 70590

CT296 hypothetical protein
^FWICYLIRIRMllSALHWHLRHFHNHGSILFEMXTIKDCFLLETKW^^FIAKASKTID 
RWRENIFRSMPEIYTWRJOIRI^FFAAEXVHRPKLSLVRDLWVFPGEEILEGEEDCML 
frLLLSGDRAGSG I FFTGPYPSDLYELEKGTTGLLLAFSSVG I PV I 

CPn_0056 70917 72746 

mrsA-Phosphomannomutase 

EFLKI^LHRISLMKEVEQRIRSLYDAVTAENICRWLSNDCTC^DAKTILGWLDTDPAQLE 

DLFGATLTFGTGGLRS LMG IGTNR INLFT I PJtTTQGLVQVLRAHLPHPG DPMRVWGCDT 

R^NSIEFAQErTAKVI^G>XXEVFLFQYPEPLALVSFTVRYERAIGGVMITASHNPPN^ 

YKVYMASGGQVLPPLDQEIVAACSAVNEILSVPSIDHPNIHLIGKEYEALYRDTLKQLQL 

YPEANRISGRSLSISYSPLHGTGISLWHVLKDWFLSVHLVEKQAIGrXIDFPTVQLPNP 

EDPEALTt^IEQMLAJJDDDLFIATDPDADRVGVVCLEDGOPYRFNGNQMASLLADHILGA 

WSKTRFIIXJEHDKLVKSLVTTEMLSAIAXHYHVDLINVGTGFKYIGEKIESWR^ 

GAEESYGCLYGTHVEDKDAIIASALIAEAALC^KLCGKTUTDAI^LYETYGYFANKTES 

\A/FSAKTDEQEIRKKLSHLEEISSANFFSGKYQVEKFEKYKQGIGFNLLSKDSYALTLPK 

TSMXCYYFSGGGRVIIRPSGTEPKIKFYFEMSTHYPERVTDKEIQKQREAESFQHLDDFI 

FDFKEKFSNL 

CPn_0057 72913 73554 

sodM- Superoxide Dismutase (Mn) 

I LKRYWMSFVPYSLPELPYDYDALEPV I SSE IMI LHHQKHHQ I Y I NNLNAALKRLDAAE 
TOQN12JELI ALEPALRFNGGCH INHSLFWETLAP I DQGGGQPPKHELLSLI ERFWGTMDN 
FLKKL I EVAAGVQGSGWAWLGFCPAKOELVLQATANQDPLEPLTGKLPLLGVDVWEHAYY 
LQYKNVRMDYLKAFPQ I INWCH I ENRFSEI I SSK 

CPn_0058 73627 74562 

accD-AcCoA Carboxylase/Transferase Beta 

I RWLVR LFSYDK PK I KVQK I KADGFSGWLKCNHCHEM I HANELCQNYNCC PKC S YH YR IT 
AIERVKLLADKDSWRPLYTDLKSQDPLEFIDTDTYANRLEKARKNTTESEGVIVGICTIG 
LHPVALAVMDFNFMAGSMGAWGEKLTP.L I EEA I ETRLPVI IVSASGGARMOESVFSLMQ 
MVKTSAALAKLHEAGLPY ISVLTNPT3GGVTASFAALGD 1 1 1 AEPKALICFACPRWAQV 
rGEDLPEGAOKSEFLLEHGMIDKIVERKELKTTLQTLLDYFLAQEYTGGKSKAPRDLSKR 
LKEIFLLTDDSE 

CPn_0059 74562 7^050

dut-dUTP Nucleotidohydrolase

r KHHTA.-JCNDN ITrNATLMTVFCELD.TOELPE-nTPGAAGADLHA^ 1 ALLR^RA 
LlPTf] £/AEI PEGYELOVR PR3GLALKHG ITVLJ 13 PGT r D3 DYRC JE I RV t L t NFtlDJTF I 

r e pkmp i ao wl:; pv\'oat f wkqe:: la ftarc.';gg fgi itcas 

CPn_0060 75004 ?WM

ptsN-PTS IIA Protein

RKLPEE-/EVLV I LE0AKMP^Yf?0M000F.'JLFLi"LL3PF(LVMFLGKH^RDE I LOU1 .TPI ,VDA 
AGLLEDKQAFFDALVRREN IM3TfJ ffJMr;-/A I PHCKLE3C:JfJFF I A IC t\ ITCt 1 1 LVIl V\ £ DtJ 
ALVRLVFL IGGPENAOAEY [ .KLL3TLTL3 r.,RKECRRC<jr*LOVNT lEEVMNVFVf ;M 
CPn_0061 7S'i0l 7620S
ptsN-PTS IIA Protein HTH DNA-Binding

R^HECr(;c;DVKMDLKLDEVAJLLDVCEHTVt^WLKEGS^BfSMNNE'/RF3REEIENWLL 
HNQALM rOERt ;EDKEALKDL3LKY3LYKA ihrcgvlcdvwhskeealqyaskyiaqkfq 
LDESVLFEML:WRENLM3TCrGEGIALPKAKDFLINAYYDIWPMFLAEPIEYGALDGKP 
VC I LFFLFACODK.jHLNLVNK IVHLCMSLNAKSFFKNYPNKDQLLAYVKEWESQTH 



in^^^fti 

ECJ^PFSM 



E/\TYA I DRKAHKKP T EN 1 1 
LKLNLNOPM PYC F"MPECtJ 
IESNH 



MM I I K!I'.-M I NQKOLNAL [ EINRNNCTDPATANLLAS 
PLDLNNNSrPOI lARACOCrMTLGCTLQC IKKEPDRI 



76251 



77690 



t I.- 



kk: :!•"' " 'V- inn !!■ ! .i ' ;i * \- :pmm: :: :kpt: :k t AVI,;; : : : "!-"Pl l:: f iFAriA: I": :v« :r ; rrw its: 
bWKKPQKCSERKQAXKEPRAJlKGYLVPSSRTUoARAOKMKNSSRKEoSGGCNEISANST 
PRSVKLRRNKRAEQKAAKQGFSAFSNLTLKSLLPKLPSKQKTSIHEREKATSRFVNESQL 
SSARKRYCTPSSAAPSLFLETEIVRAPVERTKELQDNEIHIPVVQVQTNPKEQNTKTTKO 
LASOASIQOSECTEQSLRELAQGASLPVLVRSNPEVSVQRQKEELLKELVAERRQCKRKS 
VRQALEARSLTKKVARGGSVTSTUIYDPEKAAEIKSRR^CKVSPEAREQKYSSCKRDARA 
NGKODKTTPSEDASQEEOQTGAGLVRKTPKSQVASNAQNFYRNSKNTNIDSYLTANQYSC 
SSEETWPCSSCVSKRRTHNSISVCTMWTVIAMIVG^ 

CPn_0063 78109 78267 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PMYANCKHNCLCLYDFSRHRSPPGLPLTFTPPYSFTLGIFLGRCLSTSNIVLL 

CPn_0064 78340 78576 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LVMTK IQC S AQYYR S RPAERAQTP PQ PFLARDRADFWERH PRFSACCRVTXLVAWWLAL 
LFLFVMLLPLAAGSYLLAF 

CPn_0065 78882 80651 

CT288 hypothetical protein 
YDYYKYNMFFKKNY>rrDFPTHFKGPKIJ*PIKWPNFFE^ 

SGIVLIIGT PLGAP ISM I LGGCLLASGGALFVGGT I ATI LQARNSYKKAVNQKKLSEPLM 
£RPELKAIJ}YSLDIJ<EVWDLHHSWKHIJ<j<LDL 

M I S ENYDAC LJCMIAYREEIiKEG/TOYQETRFNQNLTHRNKVLLS I LSRITDN ISKAGGVF 
SLKFSTI^SRMSRIHTTTWIIAI^WSVMVVA^ 

LSYLVRQ ILSNTKRNRQDFYKDFVKNVDI ELLNC'Tm^RFI^EMUCGVUCEEEEVSLEG 
QDWYTQYITNAP I EKRLI EEIRVTYXEIDACTKKMKTDLEFLENEVRSGRLSVASPSEDP 
SETPIFTQGKEFAKUIRG^SONISTIYGPDNENIDPEFSLPWMPKKEEEIDHSLEPVTKL 
EPGSREELLLVEGVNPTIJlEL^^IAIJ>GXX3LSSVRKWRl^PRGEHYGNVIYSDTELDRIQ 
MLEGAFYNHLREAQEEITQSLGDLVDIQNRILGIIVEGDSDSRTEEEPQE 

CPn_0066 80916 82655

No robust homolog present in Genebank/EMBL as of 11/7/98

g\fymanptqsrppspeisieelexqelagssnt 

nsjedeeg plg sc evydwc i tnqgdpevrdhevrvmyingsgrtqheg i ldamnicdlrg 
e^mfihnsgygi^scfu3irj^ippp^nvisqaiqarwneffifaenanrdyivlfsgn 
galylqvaxdnsiyshhilcvgigssyyiqgnyhvhhtr^vtgdwttlixrj^ j 
U|S^saeglflpsvrcpsygwai^cgecx:limcnncx3vgfiipqdssseialv^^ 
s^rliewidrgdsqavlei^pqpshcrdialtalyattrissi^eci^isvtyape' 7 

fvj3yai vtg ys i mtlryf i llltnrpgcrriifrvijujuu/si^slgfltvlldh 

vjnrrp pl i svi fctasfatgsf i yvdltrwfftslrs rlqlfvqrrltgrglbl 
mjslrfsqhlalitfhgglfwpliigffnqlviqvprwirpnttawdli^sqea^s 
gies^igcrinfllcmii^vintfffvrsvrrnlhrrphr 

CPn_0067 82920 84053

No robust homolog present in Genebank/EMBL as of 11/7/98

kgsgysyrgppmavegrvnssqaij^odcoevlankqskgi^crilsiwaviMagvw' 
li altlas i ltsvpylalgvfll i vtlgc 1 1 falcseki kkvpptp ishkee i dftwfee 
k^emekekedpehfgrtatdipmrsaldqfnhschhihespaltety^ 
cp\^lpdvtseeevlirswgsyllmeacvpkvsmlidelhnkij(spse 
QR^flftqkdlatfflaytrvnkshiapfrag^ 
cyyarlafnqtqrlyhqlfnveklrs iyanmdkdplchpwaf ipiydllktmhgdgfle 
ooedreypsraaqdqfwg ' 

CPn_0068 84909 84331

CT?$0 hypothetical protein 

SFWiKKFFIYSLIFSCSFSAPLKGICNEDVSSQSRIEEDPEVLITQLNrfLIETPIEEGKE 
I RfELQA I S DGQKSSEEIEESCGTSDSEGLSEKTDKESSNEYVLDFFDBMVQRLEG IS KM 
CQSGQVAQ 1 1 DCFNREFDI RNRELELKNRELELREKDLEFKKS ILDWjJkEKVSRELAFQR 
EQD IKQTLMLLKK ' 

CPn_0069 85191 87086 

No robust homolog present in Genebank/EMBL as of 11/7/98
LNFLYVYLLIFNLGIMTTPPPSRSSSPPPYDWIEL^DLGNTNNNBSRATPPPPEVGGELP 
PYFSASNFWIERGAPSLPSPQOLLSLPEYSRQPPPGYFDETAdtTSRTSEEMFGTLVST 
LCCPANSERDWEDHEVNC IY IASTSDTCLEAVQGCMHITELRgI PVRVLYETGHLYAFAR 
ENTCHSRLEVSHTVRAMTYFWDRFFSRHWNVGRRFLVTTfQGNTCAYVQAALDSSMHTQDI 
WLGLSPTVY IRGNYHVQHYRVRGFWPSCLDSLAACAfKTS^tPYGESSDGI FYPSLFSH 
TFDNAIRYGERCLLVCSEGMGMLPETQOOTSPLTSLEGGHEVALVLNPQQNPEALSIASR 
LMHEERGGRLES^>!PGRSSNPFMTSMYVLVRLNTLAQIYlfl4SPYYSF0SNDIVCLIFIS 

gaavetvs y i fltvtdstcgrrylrvprlvctglrnlalp/tllelli LSYPRSVEGVPF 

NVRF I LGYMCTT R WFFAWNLI LHWPFRCLRHG IQLFVHJlS I IGHTLGAR ITDLTLASMR 
YA I VFP5 1 V5SC LLTALAH ANTN I LALDPYRL I ESGDLBR PAFNDDEMQQADNPWDAYS I 
GLVINTC lYMLrLFANLIFMVYSVRRYHRSRR 

CPn_0070 87399 87208

No robust homolog present in Genebank/EMBL as of 11/7/98

ykvclfhlknonffsnosrtyeorfpkvsphfesi/rlqsvgfsscgtllisfrotelkr 

DLYI ' 

CPn_0071 880*6 87509

'*V\2 c j hypothetical protein 

_i k:;lp.:; [ lef .[cplqharc lkkqhk i ieelep/pfokdhlylklmensssrdafdkkrml~ 
kp: nlw< widlylyevyqdc I lffftytkai/iccc iaglftevysgetpst r LTr kp r F 
i' ( ji< lt f ' y l; ; F f ;n [ jeslymrmkqiavqyl/Cpp&t 

'■■I'n.'KJV;: H')15L HR()/7 

|.Tj:m hypothetic.! I protein / 

I^JY^rJTKTSVYKEKVtaLCYCLLFYFFFl^JTPL.-.-CGrSP^DOYVPQELFCDRLGGSR 
:.N:.PU::NA:X;D:;iMVr:pPI^ALVALTDL/LVPYNOr]aF^WTTRLKNAVEKrCLFLQRNWK 
ftLLfll .AWAL r I .VCHHTV ALTLT [WuA/CLC IGWFG [ FTATCLDKENKHRHVNS LWNL 
Itmi f Ujl.IjWJf ;TR0 1 LLATM T AS I SAL [ YAVPQAVGLV IGF.'? INTVYCARLCD 



CPn_0073 89353 89574

infA-Initiation Factor IF-1

SMA^KEOTLVLEGir/EEIiPCMHFRVILENCMPVTAHLCGKMRMSNIRLL'/CDRNmrtMS 
AYDLTKARWYRHR 

CPn_0074

tufA-Elongation Factor Tu

EDFEMSKETfQRNKPH IN IGT IGHVDHGKTTLTAAI TRALSGDGLASFRDYSS I DNTPEE 
KARG I T I NAs HVEYET PNRHY AHVDC PGH AD YVKNM I TGAAQMDGA I L WS ATDGAMPQT 
KEH I LLARpVGVPY I WFLNKVDM I SQEDAEL I DLVEMELS ELL EEXGYKGC PIIRGSAL 
KALEGDAYY I EKVRELMQAVDDN I PTPEREI DKPFLMP I EDVFS I SGRGTWTGRI ERG I 
VKVSDICTOLVGLGETKETIVTCVEMFRKELPE£RAGQ 
NSVKPOTKFKSAVYVWKEEGGRHKPFFSGYRPOFFFRTTD\rrcV\n , L 
VELDV^LIGTVALEEGMRFAIREGGRT IGAGT ISKIMA 

CPn_0075 91087 91350

secE-preprotein translocase

SRSWFMKQQHNRKALSRK IGTVKKQAKFAGS F LDE I KK I EWVS KHDLKKY I KWL I SIFG 
FG9AI YFVDLVLRXSITCLDG ITTFLFG 

CPn_0076 91334 91903

nusG-Transcriptional Antitermination
^PFCSVfcOfYKWYWQVFTAQEKJO^KALEDFKESSGMTD^ 

~Y IWPGYLLVKMHLTDESWLYVKSTAG I VEFLGGGVPVALS EDEVRS I LTDI EEKK 
I SGVVQKHQFEVGSRVKINDGVFVNFIG^SEWHDKGRLSVMVSIFGRETRVDDLEFl^V 
EEVAPGQESE 

CPn_0077 91956 92435 

rl11-L11 Ribosomal Protein

FFVSYPLFVEVSC^KVRFSMSVKK^KIIKLOIPGGKANPAPPIGPALCAAGVNIMGFCK 
EFTJAATQDKPGDLLPWITVYADKTFTFITTCQPPVSSLIKKTLNLESGSKIPNRI^^ 
TQAQVEAI AEQKMKDMD IVLLES AKRMVEGTARSMG I DVE 

CPn_0078 92453 93160 

rl1-L1 Ribosomal Protein
SCRIMTKHGKRIRGILKOTDFSKSYSIJIEAIDILK^ 

C^IRGAVFLPNGTGKTLRILVFASGNKVKEAVEAGADFMGSDDLVEKIKSGWLEFDVAV 

TPDMMREVGKI^KVliGPRNI^PTPKTGTVTTDVAKAISELR^ 

KLSFESSQIKENIEALSSALIKAXPPAAKGQYLVSFTISSTMGPGISIDTRELMAS 

CPn_0079 93170 93688

rl10-L10 Ribosomal Protein
RGKMKQEKTI^EVEDKISAAC^FIIXRYUOTAAYS 

FKAI EAAGLEVIXrSDTDGHLGVVFSCGDPVSAAKQVLDFNKQHKDS LVFLAGRMDNASLS 
GAEVEAVAKLPSLKELRQQWGLFAAPMSQWG IMNSVLSGV I SCVDQKAGKN 

CPn_0080 93720 94121 

rl7-L7/L12 Ribosomal Protein

VRVTFCVTTESLETLVEKI^NLTVLELSQLKKLLEEKWDVTASAPWAVAAG 

EPTEFAVTLEEVPADKKIGVUCWREVTGLAXKEAKEWrEGLPKT^ 

KLQDAGAXASFKGL 

CPn_0081 94219 98016 

rpoB-RNA Polymerase Beta 

FREI LSHONSRRTRMLKCPERVSVKKKEDI PDLPNL I E IQ I KSYKQFLQ IGKLAEERENI 
GLEEVFRE I FP IKSYNEATVLEYLS YNLGVPK YS PEECI RRG ITYSVTLKVRFRLTDETG 
IKEEEVYMGTIPLMTDKGTFI INGAERWVSQVHRS PG I NFEQEKH SKGN I LFS FR 1 1 PY 
RGSWLEAIFDINDLIYIHIDRKKRRKKILAITFIRALGYSSDADIIEEFFTIGESSLRSE 
KDFALLVGR I LADN 1 1 DEASSLVYGKAGEKLSTAMLKRMLDAG IASVK IAVDADENHPI I 
KMLAKDPTDSYEAAIJ<DFYRRLRPGEPATLANARSTIMRLFFDPKRYNLGRVGRYKLNRK 
LGFSI DDEALSQVTLRKEDVIGALKYL I RLKMGDEKACVDD I DHLANRRVRSVG EL I QNQ 
CRSGLARMEK I VRERMNLFDFS S I7TLTPGKWSAKG LAS VLK DF FGRSQLSOFMDQTNPV 
AELTHKRRLSALG PGGLNRERAGFEVRDVHASHYGR IC P I ETPEGPN IGL ITSLSS FAK I 
NEFGFIETPYRIVRDG IVTDEI EYMTADVEEECVIAQAS ASLDEYNMFTEPVCWVRYAGE 
AFEADTSTVTHMDVSPKQLVS I VTGL I PFLEHDDANRALMGSNMQRQAVPLLKTEAPWG 
TGLECRAAKDSGAIWAEEDG\A/DFVTCYKVWAAXHNPTIKRTYHLKKFLRSNSGTCIN 
CX3PLCAVGDVrTKGDVIAIX5PATDRGELALGKN^VAFMPWYGYNFEDAI I ISEKLIRED 
AYTSIYI EEFELTARDTKLGKEEITRD I PNVS DEVLANLG EDG I IR IGAEVXPGDI LVGK 
ITPKSETEIAPEERLI.RAircEKAADVKDASLTVPPGTEGVVMDVKVFSRKDRLSKSDDE 
LVEEAVHLKDLOKGYKNQVATLKTEYREKLGALLLNEKAPAAIIHRRTAEIWHEGLLFD 
OET I ER IEQEDLVDLLMPNCEMYEVLKGLLSDYETALQRLE INYKTEVEH I REGDADLDH 
GV I ROVKVYVAS KRKLQVGDKMAGRHGNKGWSK I VPEADMP Y LSNG ETVQM I LNPLGVP 
SRMNIXIOVLETHLGYAAKTAGIYVKTPVFEGFPEQRIWDMMIEOGLPEDGKSFLYDGKTG 
ERFDNK WIGY I YMLKLSHL I ADK I HARSIGPY3LVTQQPLGGKA0MGG0RFGEMEVWAL 
EAYGVAHMLOE I LTVKSDDVSGRTR I YES I VKOENLLRSGTPESFNVLI KEMQGLGLDVR 
PMWDA 

CPn_0082 97992 102221 

rpoC-RNA Polymerase Beta' 

CSSYGRRRLKNDVLEKIMFGEN3RDIGVLSKEGLFDKLEIGIASDITIRDKWSCGEIKKP 
ETINYRTFKPEKGGLFCEKrFGPTKDWECCCCKYKKIKHKGIVCDRCGVEVTLSKVRRER 
MAH I ELAVP I VH IWFFKTTPSR IGNA/LGMTA^DLERV I YYEEYW I DPGKTDLTKKQLLN 
DA0YREVVEKWGKDAFVAKMGGEAIYDLLK3EDLQSLLKDLKERLRKTKS00ARMKLAKR 
LKirEGFVSSSNHPEWMVLKNIPWPPDLRPLVPLDCCRFATSDLNDLYRRVINRNNRLK 
AILRLKTPEVTVRNEKRMLOEAVDALFDNCRHGHPVMOACNRPLKSLSEMLKCKNCRFRQ 
NLLGKP.VDY3GRSVI rVCPELKFNOCGLPKEMALELFEPFI I KRLKDOCUVYTIRSAKKM 
IOROAPEVWDVLEE I^IKriHPVI.LNPAPTL.HRU; [QAFEPVL I EGKA [ R IHPLVCAAFNAD— 
FDC;rX?MAV}!VPLSVFJ\OF J EAKVLMMAr'rjN I FLPrj.'JHKPVA [ P.'o KDMT I X 1 L\"Y LMAD PT Y F 
PEFHGTJKTK IFKDEI EVLRALNNK/JF I DWF'MRDZTr'.RC, I li [ HEK I KVR [ DHQ I I ETT 
P f '.RVLFNRrVPKEU3FOWSMI-KR[:;EL[I/^;YKKVr;[..EATVHF[^.KDLT,FI^ 
Al:JMC;r,KDVR I PDIKSH ILKDAYOKVArVKKOYDCXi [ ITEGEHHSKT I:J n/rFA/3E0Lr>D 
AL.YVf' r^KOTRSKHNPLFLM tD.'7;Ar' f iNKj'Ot.Kor/'AL.RGLMAKPNCIA I IKSP [T.'"NFRE 
'U/rVLEY/JI.SSHCARKCLADTALKTADlJiIYI/Pf'RLVDVAODVriTEKWro-TLNIirEISAr 
GV WEELLPLKDR n\;RTVAKDV/0^;DK::«|.I^y:;(jrjVI,N.*;VOAEAIDDAG I FT t K [RS 

TL*rt:i-::;i'RcvcAKCYi:Li*LANi ;hl inn ikav ; r 1 aaq : ; 1 r ;kpc avr/mHTKMi/ v ; 1 aats 

^^l'F-KTN:;iX3[LVTMDLRWr/^E';MNtA/[JIKKf:At.lW;Dmirrr^^ 
:iLKVf--pvEU;VK Il.VADGTPV: \<SW I'lKVULHIIIPl rCDKmFlKYKnt.VF-S J ISTEKW 
NKNTCLVEL IVKQHRGEUiPQ I ArYODAOLGEL'ATT^^^^r I3VEEGQRVDPCMLLA
RLPRCA I KTKD ETGCLPRVAELVEARKPEDAADI AK T^^VkG IQKNKRILWCDEMT 
GHEEEHLrPLTKHLIVQRGOSVIKCWLT[X}LWPHElWKCVREI^KYLVNEVQEVYR 
LOCVD I NDKH t E I 1 VRQM LQKVR ITDPGDTTLLFGEDVNKKEFYEENRRTEEDGGKPAOA 
VPVLLC ITKArJLGTES F I 3 AA5FQDTTRVLTDAACCSKTDYLLGFKENV IMGHMI PGGTG 
FETHKR IKOYLEKEOEDLVFDFVSETECVC 



HL.WMEVr:>LD3r 



CPn_0083 102296 103312



: -\ i fjsi.-akvpm: :;inr-T<.'! .FKLrrr t v f :r ■. ■< :r >r- ■■»: : f vi iA"Tf irr;i . f lkva'.'F.pkf 

J E LLNEAWWG I RQNG DOLCTLSF I LOK I QVNFALE I I KNI PGR I S LE I DARUJ FNVEAM 
VQRAVFLSQLFEAMGGDKKRLLVKI PGTWEG I RAVEFLEAKG IACNVTL I FNLVQAIAAA 
KAKATLISPFVGRIYDWWIAAYGDEGYSIDADPGVASVSNIYAYYKKFGIPTQIMAASFR 
TKEQVLALAGCDLLT I S PKLLDELKKSQH PVKKELDPAEAKKLDVQPI ELTESFFR FLMN 
EDAMATEKLAEGIRI FAGDTQILETAITEF IKQI AAEGA 

CPn_0084 103356 103751

predicted ferredoxin 

SE^KNKMDYKSQLWSCPCCCKGNVCFSVFNLDVrLTCNVCSSTYTFDSVIRNEIRQFVA . 

LCKRIHDANSILGNATVSVSVEDNQMDIPFQLLFSRFPVVLNLSLDGKKIAIRFLFDALN 

TSILHQESDLIS 

CPn_0085 104512 103766

CT311 hypothetical protein 

FSMKFFILFILIVAQFPAFSAQPRTQVSASHSKQAKARRTSRIRSSAATNASVSRYKTRA 
AARKK I GKFEKK PSLS PVQWVRYSGKNYS IQT PSLWQC I DDKTQL PEKLDVLL IGKGKGN 
LT PT IN I AQEITSKSSKEY I EE ILAYHKANEMTLESG IFTQIQSPSGEFTI IKTEKNSSW 
GR VFCLQATTVI DHTAY IFTST ATLDDYAELS FTFLKWSS FQI RGGKEATSGDA I LEKA 
LEALQNENK 

CPn_0086 104898 105527 

atpE-ATP Synthase Subunit E 

NI MANLNADGKLKQ I C DALRLDT LK PAED EAAALLHNAKEQAKR I I QEAQEEARKI LETA 
EERAHQKIKQGEVALSQAGKRALEALKQAVENKIFTIESLVEWLEHVTTDPEVSTKLIQAL 
VQ ALEAQGVSGNLTAY IGKHVS PRAVNELLGKAVTTKIJIKKSVVVGSFVGGVOIJCVEEKN 
WVLDLS S SALLE I FTRYLQKDFREMI FQGS 

CPn_0087 105540 106376 

CT309 hypothetical protein 

SHEXI FS I FKVWMTQYYFLSS FLPTQLPESVPLFS ISDLDDLLYLNLS ENDLCNYGLLK 
RFPDFENFAFFWAGKP I PFSFGEVTQET^ERMLSSQCWSDDNDFEDFFKDFLMNHKSSQD 
Rl2WFSDLFR£FLSYHQTNSSKFI^DYFRFQQQU*VVL^^ 

DPX^LEVLMQKDSPNY EL PEEFSDLOGVLDDYGLLPHTLNRALALYQFHKLEGFC SDSY F 
D£jfe]y I LARC ATYMFA I RNS LASVEKG RE IINHIEKAIKW 

CPn_0088 106352 108145

CT-258 hypothetical protein 

SYkKGNQMVTVSEOTAQGHVI EAYGNIXRVRFDGVVRCXSEVAYVNVDhrrWUCAEyi EVA, 
GE^QVFEDTOGACRGALVTFSGHLLEAELGPGU^I FDGLQNRLEVLAEDSSFL 
Kf^yNAISDHNLW^^^ , PVASVGDTLJU*GDLL^^ 

Gf^AHTWAKARDAQGKECAFTMVQRWP I KQAF IEGEKI PAHKIMDVGLRIL 
KGGTFCTPGPFGAGKTVl^HHLSKYAAVDIVII^ACGERAGEVVEVl^EFPHLID&(^K 
SL&LfRTC I ICNTSSMPVAARES S I YLGVT I AEYYRQMGLD I LLLADSTS RWAQALRKl SG 
RLEEIPGEEAFPAYLSSRIAAFYFJlGGAITTKDGSEGSLTICGAVSPAGGNFEEpyTOpT 
LAWGAFCGLS KARADARR Y P S IDPL I SWSKYLNQVGQ I LEEKVSGWGG AV 
G$EIGKRMEVVGEEGVSMEDMEIYU<A£LYDFCYI/MNAFDPV^ 
R t F'DAKFVFDSPDDARSFFLELQSK I KTLNGLKFLSEEYHESKEVI VRLLEK7MVQMA 

CPn_0089 108111 109466

CT-259 hypothetical protein

LDCWKKQWYKWRKDMQT I YTKITDI KGNL I TVEAEGARLGELAT ITRS DGRS SYASVLR F 
D^K^I^VFGGTSGLSTGDHWFLGRPMEVTFGSSLLGRJU^IGKM^ 
E f AT PT FN P VCR I VPRSMVRTN I PM I DVFNCL VKSOK X P T Ffi SSH E WWM AT 1 .MB T A Afyr n 
A^rW * GGMGLT FVDYS F FVEESKKLG FADKCVMF I HKAVDAPVECVLVPDMALACAEKF 
AVEEKKNVLVLLTDMTAF ADALKE I S I TMDQ I PANRGY PGS LYSDLALRYEKAVE I ADGG 
SI;T|rTVTTMPSDDITHPVPD^m^ITEGQFTLRNNRIDPFGSI^LKQLVIGKVTREDH 
GDI^ALIRLYADSRKATERMAMGFKLSNWDKKI^FSELFETALMSLEVNIPLEEALDI 
GWK I LAQSFTSEEVG I KAQLINKYWPKACLSK 

CPn_0090 109439 110080

atpD-ATP Synthase Subunit D 
VIJVKSMSVQVKLTKNSFRLEKQKLARLCTYLOT 
QAYER I YAFAELFS I PLCTDCVEKS FEIQSI DNDFEN/AGVEVP I VREVTLF PASYSLLG 
TP rWLDTMLSASKELWKKVMAEVSKERLKILEEELBttVS I RVNLFEKKLI PETTK ILKK 
IAVFLSDRSITDVGQVKMAKKKIELRKARGDECV 

CPn_0091 110074 112053

atpI-ATP Synthase Subunit I 
VRLNIHKYLFIGRNKADFFSASRELGVVEFIS^KCFITTECGHRFVECLKVFDHLEAEYS 
LEALEFVKI)ESVSVEDIVSEVLTr^KEIKGrXETVKALRKEIVRVKPLGAFSSSEIAELS 
RKTGISLRFFYRTHKDNEDLEEDSPNVFYLeTAYNFDYYLVLGVVDLPRDRYTEIEAPRS 
VNELQVDLANLQREIRNRSDRLCDLYAYRftEVLRCLCNYDNEQRLHOAKECCEDLFDGKV 
F A VAGWV IVOR I KELQSLCNRYQ I YMERffPVD PDET I PT YLENKCVCMMGEDLVQ I YDT P 

aysdkdpst wvffaf vlffsmiwdag/gllflmssllfswkfrrkmkfskhlsrmlkmt 
arlglgcicwtri'l'sffgmsfskts/fre-rsmth^/lalkkaeyylqmrpkaykeltney 

PSLKAI RDPKAFLLATE IGSAG I ES/YWYDKF I DN I LMELALF IGWH LSLGMLRYLRY 
RY3G TCW I LFMVS AYL YVP I YLGT/SL IHYLFHVPYELGCQ IGYYGMFGG IGLAWLAM I 

qr::wrgvee i r sv iovfsdvls y/r i yalglagammcatfnqmcarlpmllgs i v i llch 

SVTII I L3 TMOGV IMCLRLNF lE^HYSFDCGGRPLRPLRK IVCSEDAEASC IHLDNNS I V 

w-ujwr* l 1/121 LL257 3 

■--.itpK -ATP Synthase sybiihir~K~ " — -- - " . - - • - 

PYI.KflAHKVSM t DM:;W(^LVUGLAMIG^AIGC;;MACVA.'JHAVM;1R IDEGHGKLIGM3A 

MF-:::;y:; i yci- r lmllmo*\ rKNCTL3PVCG r a k;l;;vca,\llvs::v'mogkc''-7:^ iqaya 
h:;:;:: i v ;kcyaa u ;i vuTifslfavvfallll 

':in_0fj'.M / H2AM) I liO 1 5 

f.T:r) i hy|K'.r(n..^,;,,i protein 

'".TAwWr;AI''J'>;[ 1 ^DLiHOVMO:jVMORLCL:>NLFHCLLLFI,RYYY:'KLVFfi 

ici.it *"M-:P:;Ly:nT.Yvr;PEY:;AAAOu;[^ 

i .y i Mvrtt ;nt :K#EK[;rYi:vNos agyr vyolkgle yk elo< ; 1 1 : iyrvalcsgnqe i vsrrh 



CPn_0094

valS-Valyl tRNA Synthetase

WRVFLSRDHKFCLRrMTTEDFPKAYNFOO^EPELVVFWEKNnMFKAEASSDKPPYSVIM 

p p pnvtgvlhmg hal vnt lq dv l vr y k rmtg f evcw r pgtdhag r atqawerhloaseg 

KR RTDYSR EDFLKH I WAWKEKS EfCA* L^0 LRQ LCC.TC WDR K R FTM E P LANR AVKKAFKT 
LFENCY I YR^LVNWDPVLOTALADDEVr^ EEKC WLY Y I RYRMVCSQES I WATTRPE 

""."i. ;pt " : r ir 1 - pya::w : " " :\ r \-~r"^^rr~\\'?\ w rrwr.y~ 

»y',m( ""rjifiii . I'M f F) i ;.Tf'. '' • i "i: '• '■/■ "' :makk--'a;'!-:» \; jm-vrke^yf:..- 

VWJ YRskwW i£PYL3KWWryjV:^r ^\JALi'.EFVL.J^u; K 1 Fi'KL'r VKMYLJWVNHLRSW 
CISRQLWWGHRI PVWYHKTJDDER/LCYDGEG I PEEVACDPDSWYQDPDVLCTWFSSGLWP 
LTCLGWPDENSPDLKKFYPTAL/vTGHD I LFFWVTRMVLLCSSMSGEKPFSEVFLHGL I F 
GKSYKRYNDFGEWSY.ISGKEK^YWIGEALPDGWAKV^KLSKSKGNVrDPLEMIATYGT 
DAVRLTLCSCANRGEQ I DLD/RLF EEYKH FANKVWNGARF I FGH I S DLQGKOLLAG I DED 
S LGLEDFY I LDGFNQL I HOl£EEAYATY AF DKVATLAY EFFRNDLC STY I E 1 1 K PTLFGKQ 
GNEASQSTKRTLLAVLLI^Vl^VLHPVAPFITESLFLRIODTLGALPEGDGDAfTCHALR 
ML RS RACMEAPY PKAFDVK IPQDLRESFT LAQ RLVYT I RN I RG EMQ LDP RLH LKAFWC S 
DTTEIOSCIPILOALGpLESIOLLDKEPEKGLYSFGVVDTIRLGIFVPEEHLLKEKGRLE 
KERVRLERAVENLER^DESFCQKAWPNLWAJ<CEALK>^rEIXX5ILDKLASFA 

CPn_0095 115956 118790

pknD-S/T Protein Kinase

ac i vcldredq9s ler yd i vr i igkggmg evy la yd p vc s r kv alk kir edlaen p llkr 
rflreariaadlihpgvvpvytiysekdpvyytmpyiegytlktllkswqkeslskela 
ektsvgaflsd fhkiccti eyvhsrg r lhrdlkpdni llglfseav ildwgaavacgeee 
dlld idvs beevlssrmt i pgr i vgt pdymaperllgh pas k std i yalgwl yqmltls 
fpyrrkkg^kivldgqripspqevapyreippflsavvmrmlavtipoeryssvtelkedi 
eshlkgs/kwtlttalppkkssswklnepillskyfpmlevspaswyslaisniesfsem 
rleytl/kkglnegfgillptsenalggdfyc<;ygfwu4ikertlsvsl\^sleiorcs 
qdles/ketflialeqhnhslslfvtxritwlihmnylpsrsgrvai ivkdmedi ledig i 
fess^sli^vsciavpdaf'laeklydralvlyrkiaesfpgrxegyearfragitvlekas 
tdn^eqefalaieefskiiidgvaapleyi/akalvyqrlqeyneeikslllalkrysqhpe 
ifilij<dhvvyrlhesfykrdrijy:vfmilvleiapoaitpg<: 
jsdptvlelrsskmelflsywsgf i phlns lfhrawdqs dvral i e i fyvac dlhkwqfl 
£sc i difkesledqkatee ivefsfedlgaflfaiqs i fnkedaek i fvsndqls p i llv 
y i fdlfanrallesqgeai fqaldl i rskvp enfyhdylrnh e i rahlwcrn ekalst i f 

' EJTYTEKQLKDEQHELFVLYGCYLALICXjAEAAKQHFDV^ 
KDALSYQERRT J J .RQKFLYFHCLGNHDERDLCGTMYHLLTEEFQL 

CPn_0096 124347 118837 

CT296 hypothetical protein 

etflsi lreffmkslpvyvsg i kvrnlknvs i hfns ee i vlltgvsgsgkss i afdtlya 
agpjcryistlptffattittlpnpkv^eihglsptiaikq^fshyshatvgsttelfsh 
lallftlegqardpktkevldl yskekvlst imels egvq i s i lap llrkdiaaiheyaq 
c^ftkvkcngtihpiysfltsgipedcsvdividtliksenniaj^kvslftalefgegh 
csvlsdeelmtfstkqoi ddvtytpltqqlfs phalesrcslcqgsg i f i s i dnplli de 
• nls ikenccsfagncssylyht iyqaladalnfnletpwkdlspeiqni flrgknnlvlp 
vrlfdqtli3kj<nltykvwrgvlndigdkvryttkpsrylskgm 
svatwegktftefc^msltwwhwfskvkspslsiqeilqglkqrlsfl 
raiatlsggeqertaiakhlggelfgityildepsiglhpqdtekligvikklrdc<3nw 
ilveheermisladriidigpgagifgge^fngkpedfi/insssltakylrqeltipip 
es reaptswlllteat i hnlknls i rlplarl igvtgvsgsg kssl inntlvpa i esflk 
qenpknlhfewgcigrlihitrdlpgrsorsipltyikafddirelfasoprslrogltk 
ahfsfnqpcxsacic^c^lcriwisdddtpipcsecc^kryhsevleilyexjkni^ 
tayeaekff ishpki hek i halcslrldylplgrplstlsgge iqrlklahellfaspkq 
tlyvldepttglhthdiqal i evllsltylghtvlv i ehnmhvvkvcdyvlelgpeggdl 
ggyllasctpkdliqlntptakalapy i egsldi rwks eppss pkscdi l i kdayqnnl 
khidlalprnsliaiagpgasgkhslvfdilyasgniayaelfppyirogllketplpsv 

GEVKGLSPVI SVRKCSSSNRSYHT I ASALGLSNGLEKLFAI LGEPFS PLTEEKLSKTTPQ 
T 1 1 DSLLKS YKDDYVT ITS P I PLGSDLE I FLOEKQKEGF I KL YSEGNLYDLDERLPLNL I 
E PA I V IQHTKVS PKNS SSLLSA ISVAFSL S S E IWI Y I SQKKQRKLS YSLGWKDKKGRLY P 
E I THQLLS S DH PEGRC LTCGGRG E I LK I SLEEHKEK I AH YT P LEF F SLF F PKS YMK PVQK 
LLKDE14ASQPLKLLTTKEFLNFCRGSSEFPGMNALLME0LDTESDSPLIKPLLALTSCPA 
CKGSGLNDYAhTYVRINNTSLLDIYQEDATFLESFLOTIGTDDTRSIIODLMNRLTFISKV 
GLSYITLGQRQDTLSDGENYRLHLAKKISINLTNIVYLFEEPLSGLHPQDLPTIVQLLKE 
LVAm T ^m r IATDRSCSLIPHADHAIFL^PGSGPC<X^FIJ^DSDTE^CPSVDLHA^POTEV 
CPKAPLS ISKANHTRGSDRTLKVNLS IHH IQNLKVSAPLHALVAIGGVSGSGKTSLLLEG 
FKKQAELLIAKGTTTFSDLVVIDSHPIASS0RSDISTYFDIAPSLRAFYASLTOAKALNI 
S STMFSTTTrKOGCC S DCQG LGYQWI DRAFYAL EKRPC PTC SG FR IQ PLAQEVLYEGKH FG 
ELLHTPI ETVALRFPF IKK IQKPLKALLDIGLGYLPIGOKLSSLSVSEKTALKTAYFLYQ 
TPETPTLFLIDELFSSLDPIKKQHLPEKLRSLINSGHSVIYIDHDVKLLKSADYLIEIGP 
GSGKQGGKLLFSGSPKDIYASKDSLLKKYICNEELDS 

CPn_0097 124549 126006 

pyk-Pyruvate Kinase

DSMITRTKI ICTIGPATNSPEMLAKLLDAGMNVARLNFSHGSHETHGOAIGFLKELREQK 
RVPLAIMLDTKGPEIRLGNIPQPISV^OGQKLRLVSSDIDGSAEGCVSLYPKGIFPFVPE 
GADVLIDDGYIHAVWSSEADSLELEFMNSGLLK3HKSLSIRGVDVALPFMTEKDIADLK 
FG V EONMDWAAS FVR YGED I ETMRKC LADLGN PKMPI IAKIENRLGVENFSKI AKLADG 
IM I ARGDLG I ELSWEVPNLQKMMAK V.S R ETC HFCVT ATQMLESM I RNVLPTRAEVSD I A 
NAIYDGSSAVMLSGETASCAHPVAAVKIMRSVILETEKNL3HDSFLKLDDSNSALQVSPY 
LS A IGLAG IQ I AERADAKAL I VYTESGrjS PMFL3KY R PKF P 1 1 AVTPSTSVY YRLALEWG 

VYPMLTQESDRAVWRHOACIYGIEQGIL.'INYDRILVLSRGACMEETNNLTLTIVNDILTG 
SEFPET 

CPn_f309R 127494 i:»hO'»l 

No robust homolog present in Genebank/EMBL as of 11/7/98
LVGKKFHQ I KRT I LEAPLYYLV3G [ I Al/ :R| ITfRSFI.Tf !U IKGFGFIAFY I r ,'JDYRKTAL 
TNLALAFPEKTFDERYK [ARO^LQHL ■ ITLI.E I.[J\ [EOLV^N IDKL IT TVT^RNPKGFS 
. SREV.r .INEpLEETFKNl.QEKCf JL [-LF':':;i lOAWKMiFI-.Y ITKNYIW HAFAKAt KNQRUTK ^ 
K T FALPEVFKGK fVPPKNi; tooc I F.A! .1 Vj ;ki .7' ', rVC;r/OALL.M::::Y*rYPLFf;:;PAFTTTS 
PAI..L.AVKTCFPV I AVTJV! '.ROAKGF RV ■ i-: )AK I .YANKSI.I'MKK: :VA f UAIAJMW iFLEKC t A 
:;0l'F'>WWIHKRWKPKt:'W[KKKYR\^':M[|^|.-7!\-)V:::;HF:;KirAIA^^ 
< INADIILEELOEOFPEYnL rOI.KNtX?L/ 1 1 ,A! .1 fK VPA I f-^LTFJN! ij\ I [ . Y K H F R K1 v I: X* A\A' 
::KI'FI.EK:;LDHPOAPr.KN::[,l<tFY:'.Ktll.KI)KKI'KNFKVK:;K':i'l<I.TVF 

No robust homolog present in Genebank/EMBL as of 11/7/98

YY' 'A.'IYYLK I::«FAAKMAPKKNM[.N:; < 7'ITPTI'TAATI .1,1 1 -KV I l-KAP:;TPVM t KM IS I K 

i-:p i avhak::padtvatfai.d::kl: ;Fvyyrvi . \ aa: :k i -u i ro:: ikiii kfi i .tkk 
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CPn 0100 t2'Ji*>3 1278H2 

f^F^TO^^^ 

^i'.'- llil'iV:! i:i\&iAiVr>,pW^i'Mi-^-rr':.i-K:.Ai.r-rA£:- :Kprj,r.s ae ::-:nti^j- w 

TKTKETTKLYKKEW 

CPn_0101 129986 123141

Sf^a^yqlse^ 

S^DIL^YAR^ 

SLSRDGLLTRGVKIDRFKAVLRSILSPKEHKRKPLFSWIWKR 
CPn_0102 130099 131466 

FWVG IF ALT FVLGWTG I MQ I F S FG S NWANFS EYTGN I FGTLLG S EGVF AF FLESG FLG I 

[^gp^^skS^stcmvalgahmsafwi icanswmotpsgyemvmhkgklipaltsf 

^WF^ P^TT^OTFI^VLGTWLSGVFLVI SVS AYYLWKKRHHEFAKQGMKIGT ICAVIVL 

fIw^^vtgl^ 

SffKiMFS^^ 

FS LVF I ALLTLF ITVLCKKI KHGPEEENDLTEFEVK 
CPn_0103 131465 132511

PVWDC^EV^VI^GGLFAGFPACT 

wkifwdiificsgtaisfflctivc^ilglplspdtsyaslswilffrpyaalcgavva 
l^llialtscccvaaktsvskkrygyafiystlnllslilsaatltfpnilxstvdpqy 

SYTIYNSAVETKTLKSLLIIVLIGLPFIITYTCYIYRVFRGKTNFPSIY 
CPn_0104 133884 132676

Sl&RMLWMLLL^^ 

DGTX IREFS KGDLVAV IGESKDYYVI SAP PG ITGYVFRS FVLDNWEG EQVNVRLE P ST S 
AR&VRLSRGTQIQPASQEPHGKWLEWLPSC^FW 

AMDL INSALNFAH IELEKSLNEI DLEAIYKKI NLVQ SEEFKDVPG IQGLIQKALEE IQDA 
yMkslesqnts IASSQCSTPKVSSSEVTTSLLSRH IRKQTALKTAPLTQGRENLEYSLF 
RI^ASMC^NDHSEALTQEAFYRAEQKKKQVtAGVLEVYPHVVKNNre 
FIsr^TS INLEQWLGKRVTVECLPRPNNHFAFPAYYWGIKEAS 

CPn_0105 134883 134029 

CTjfis hypothetical protein 

YVWRKFSNQNPMLLIYCKKKEIHIX3WPCJTAKIRFTPKIA^IKVKINDQL^CIPPFISARW 
sdlA^IESQEGEmDC^LRi^LIDGKIISIPN^DOSIIDIAFQEHLLYLETSQSGKEDS 
RcSdKLGVGVLMNVLC^ITKGNDIQVLPKNL I SPLFSGTNPIEAILQHTPEHKDHPDAPT 
DV^EKMADVIRVLSGN^TLLPRPEPHCNCMHOT 
QSGDKLYIVTNPLNPSDQFSVYLGPPIGCTCGEPNCEHIKAVLYT 

CPn_0106 135073 136374

pt&H-ATPase 

EKVRTQMKKTMVIDTSVFIYDPEALFSFENTRI I IPFPVIEELEAFGKFRDESAKNASRA 
LS^^RLLLENAKTKVTDCA^LPSGSELRIEVAPLSNDDRRGKLLTLELLKIIAKREPMVF 
VTKSLGRRVRAEALQI ESRDYESKRFSFRSLYRGFRELQVSQEDI ENFYKNGYLDLPLDV 
VS%"PNEYFFMSAGENHFALGRYYVSEGKI I ALKAMDKSVWG I KPLNTEQRCALDLLLRDD 
VKEpTL IGOAGSGKT I LALAAAMHKVFDKETYNKVLVSR P I VPMGRDIGFLPGLKEDKLM 
KW&fefp I YDNMEVLFSINQMGNSSEALOALMDAKKLEMEALTY IRGRSLPKAFI I IDEAQN 
LTipFEElKT 1 1 SRAGKGTK I VLTGDPTQ I DSLYFDENSNGLTYLVGKFHHLALYGHMFMTR 
TERSELAAAAATIL J 

CPn_0107 137321 136392

CT058 hypothetical protein_l 
KKSPPPVTPKEIPTOPKPPIPQRPEVSPTPTDHIVPGSIEASPILGKKPSPDSMVSPLSL 
FH KMLLENWTPVEEP F PWP P AEKNQK I F AWALNQS KLI FVSTSGN I AQPRLVTDSMSjM I 
VNAANRTMSRIX3AGTNQVLSAAVSVDSWLSQRPL^PERC^TPLNEGECRAGMWRNy)GS 
NHTGKC^KPHYLAQLLGPKAVDHHNKSQAAF-DRCKNAYUJCFSIAQTLGVTFLOIPjtlSS 
GIYAPPENRKKPNSEENKVRMRWIHAVKCALVAAMQEFGNEPGNTDRRMLr/LTD^CTPA 
ITDPKKK3HL 

CPn_0108 137887 137303

CT0T3 / 
KNLFHYKAI LMS I FNEEVF I ISHRHTPLGQTSTALRNTPLVNPLHRTNLOR l£SYI P I FS 
TFIGIKTLKGISSUJYSMVU^GNFSSVCKTLPCPEIYEELPKVRKEAWLEIfGIKALYY 
LVLGVIKI IKLIVRYLCPCCRPPEPREPQNPLTPTPLDMGQQIDAIFSTPT3PTGFKDPF 
LDDLLQEDKKKAPHL 

CPn_0109 138646 141783

ileS-Isoleucyl-tRNA Synthetase
RQKMTADEVCKMSFAKKEEQVLKFWKDNQ I FEKSLQNRQGKTLYSFYD^PPFATCLPHYG 
HLLAST [KDWGRYATMDGYYVPRRFGWTXTHGVPVEYEVEKGLSLTAEfcAIEDFGIASFN 
EECRK I VFR YVHEWEYY INR IGRWVDFSSTWKTMDASFMESVWWVFQELYNGGLVYECTK 
WrF^TA[/lTPL:;NFEAJO^KE^DPSL\AmHPU3NDSASLL\AVTpPWTLPriNMAIAV 
< 1KTLVYVR [QDKK.'JGEQW I LSQOCVSRWF^NPEEFV I LESFSCKDWGRTYEPPFTFFQS 
K R EEG A F l< V t AAS FVF.ES ECTGWHMAPAFG EGDFLVC K ENH VPLffC PVDA1 \ r iC> FT EE I P 
MYCWIKMADKEI [KFLKKEGR [ FYltGTVKHRYPFCWRTDTPL/YKAVNCV/F/AVEKIK 
r)KMLHAN:;:;iHWVPyi[OlV,RFGKWLn^RWAl:;RHRYWCrrpyPlWKSAiyJEILVVG^I 
KELEELT' JTrj ITD IHRliF [ DDLN t VKIXIKPFHR E PYVFDCWFD/iIAMPYAQI IHYPFENQK 
. i™rEEAFf J ADF T AECjt.Df.lTRGWFYTL.TV r;'AU'iFDRPAFRNAI WG 1 1 LAEE/iNKMSKRLN 
NYP:;i'KYVLr/rY^AUAl.R[AXLniJ\AA/KAEDLRFSDKG[Er;v/ J KOr[ J LPL,Ttr/LSFFNTY 
AK[.Y f iFDt'KSOH [ LPAYTE [DOW t L:jNLY:/VV(lKVRF.rJMS(j/l!LNFAVEPF"/TFIDDLTN 
WY CI* l-*f 'HRRI'TfJEAEDTPnRRAAFl'TLYEVLTVFCKV [APFyPFLAED I YQKLKLEKEPES 
VII[/70r-U\iVmryK[[.Pn[.F.KRMtlDIRF.lVCU,tI::LRKEHK/.KVROPtANF , r/ , /GSKDRLS 




SDFQGENWDINCHATQ tEir/33 IDo 
CPn 0110 14 J755 

S^?[™aAro^ 

FI^QVPKG^^^^ 

PTTLSGYLVSG I AIATGLsit IGYVYTQKRRRLFPKKEEKNHKK 
CPn_0111 144761 143934

viSlespsq\Ws^secsqffsl^^ 

WVS I ETPKGTS I VRiVD ighgats pyvyslpdsktq 

CPn_0112 144743 145093
rr*fR- fPPtii2/ Glu tRNA Gin Amidotransf erase (B Subunit) 

SHPfIt^^ 

CPn_0113 145329 146405

prfA-Peptide Chain Releasing Factor (RF-1)

g£^asyl£r^^ 

ADDKOALA^PEMVVMLEEGIt^ 

ctSrvqr^ctetorvhtsaitiavlpepseeoteixin^ 

^OSAV^THLPTCVVVTCQDERSQHKNKDKA 

GSGDR^ERI RTYNFSQNRVTDHRIGLTLYNl^KVMEGDLDP I TTAMVSHAYHQLLEHGN 
CPn_0114 146371 147261

W*SYS^I^iQECTAYLDYYGVPL^ 

MLMEYRKRLALRGQRC PTAYLNGAVS FLGLRLRVDS RVL I PRTETELLAEY I INYLLSH S 
E^^FYDICCGSGCLGLAIKKSCPHVEVVLSDVCPQAVAVANENAKSNGL^ 

a^rp^Wnpfylsfttciihidpevrcyep*^ 

leigssqgesiknifskhgiygrlhqdlsgrdrifflemdgrdpvssgays 



CPn_0115 147279 148622



ffh-Signal Recognition Particle GTPase
V/V^inslVqklssifsflvssrrineewisesirevruu^ 

GEEIV&SPGQQFIRCI^EELVAFLSDGREEFTIOKTPSIILXCGLQGAGKTTTA 

dyvi^kkaWlv\^dlkrfaavdq^ 
ngbrofvildtagrlnidnelmefjl,taiokvsqanerlfawrvamgqd 

WILSOTDGDARAGAVFSIKHVLGKPIKFEGCGERIQDLRSFDTOSMAERILGMGDTI 
n r VKEMREY I S E EEDAELGKKLVT AAFTY EDYYKQMKAFRRMGPLRKLLGhW PG FNNAKP 
SQKE I EDS EQQMKRT EAI I LSMT PEERKELVELDMS RMKR I ASGCGLTLGDVNQFRKC^S 
QSKKFFKGMSKGKMEQVRKKMSGGNQWR 

CPn_0116 148592 148972

rs16-S16 Ribosomal Protein

EKNVRRKSVALKIRLRQCGRRNHVVYRLVLADVESPRDGKY I ELLGWYDPHSS INYQLKS 
ER I FYWLERGAQLSSKAEALVKQGAPGVYSALLSKQEARKLVVRKKRRAYRQRRSTQREE 
AAKDATK 

CPn_0117 148983 150071

trmD-tRNA (guanine N-1)-Methyltransferase

TGMKIDIl^LFPGYFDGPL0TSILGRAIKQPXLDVQLTNLRDFGtX3KWKQVDDTPFSGGG 
MLLJ1AEPVTSAIRSVRKENSKVIYLSPQGAIXTAEKSRELAAASHLILLCGHYEGIDERA 
I ESEVDEEI SIGDYVLTNGG IAALVL I DAVSRFI PGVLGNQESAERDSLENGLLEGPQYT 
RPREFEGKEVPEVLLQGDHKAISOWRLEQSERRTYERRPDLYLNYLYKRSIDHKFDEETT 
TNRDHFKCDK I SWLEVNK LKRAXNFYCKVFGLDAMSCENKFCLPH EGKT I FWLREVQAE 
KKN I VTLSLSLDCACEEDFCYLLRRWELFGGKLLEKOADEHAVWALAQDLDGHAWI FSWH 

RMK 

CPn_0118 150075 150464 

rl19-L19 Ribosomal Protein

KKaiFRWYIMVNLLKELEOEOCRNDLPEFHVGDTIRLATKISF^KERVQVFC<nVMARR 
GGGSGETVSLHRVAYGEGMEKSFLLNSPR IVS I EIVKRGKVARARLYYLRGKTGKAAKVK 
EFVGPRSSKK 

CPn_0119 150520 151164

rnhB-Ribonuclease HII
LMNTS ISEIQRFLSMI AFEKELVSEDFSWAGIDEAGRGPLAGPWASAC ILPKGKVFPG 
VNDSKKLS PKQRAQVR DALMQDPEVC FG ICV I S VER I DQVN I LEATKEAMLQA I SSLP I S 
PDILLVDGLYLPHDIPCKKIIOGDAKSASIAAASILAKEHRDDLMLQLHRLYPEYGFDRH 

KGYGT.1 LHVEAIR RYG P^FTHRKSFSPI KOMCAIV 

CPn_0l20 i > L 1 2 5 1 1 1778 

gmk-GMP Kinase

EKLF::NKANVCYCMNKIL\'?::PFnPD(fyKCCl'KLFTTSAPAfJVT'.KTTLVRMLEQEFSSAF 
AET[';\mRKPREGEVl\lKL , YHfVSHEEFOR[.[ J DR0ALLE«VFLR?ECYG , r'3MLEIERIW 

:;u;khavav i d cc^alf i r.-rmp^v: ; t f r app weelerrla^rg^eeg^orkerlehsl 

f E I .AAANQ F L YV I INDDL^OAYRVL.K:: [ F lAKEIIRtl ft* 



L7f*/i 



CTOil I lypor. ht;t Li.'.i 1 pior<jiit 

KH f M t KKDRFTNKKLNKLF'.^'f'F.'JLVf IYA t K'^AK I Y. [AKf ;OVl'.S:*NVA [ LTl.VLLDREtt I 

fji'Km-:K i\att A::i'rvKKKK::Kiri , N:;i'KKOi'::AY'rv;:;DVK 
metG-Methionyl-tRNA Synthetase
f 'KVMPOKVL ITSALPYANCPLHFGH I AC^LPADVYARF^^DDVLY ICGSDEFCIAI 
T^A^ECLr;YOEr^)^HKLHKDTFEKU;FALDFPGRTTNPFHAELVQDFYSQLKASGL 
rEN^SEOLV^EQEQRFLADRYVECTrCPRCGFDHARCDECQSCGADYEAIDLIGPKSKIS 
C^LW^EI^IYFLLDRMKDALLSFIOGCYLPDHVPKFVVDYIEHVRGRAITRDLSWGI 
PVPDFPGKVFYWFDAP TGY [SGTMEWAA5GGNPDEWKRFWLEDGVEYVQFIGKDNLPFH 
;^VVFPAMELGOKLD7KKVDALW^EFYLLEG^ 

;'!. : M ','^^[ ' ■l^'Vwi-'\---\'\.y;^ VtM ;1.a/': W/yfmcoafw^/--k^tp.ep.veai 

LFCACYCQKLLAL 1 3 Y P I I PE3AVA I WEM I j PKSLENCN LDTMYARDLWKEE I LDV INEE 
FHLK5PRLLFTTVE 



CPn_0123 155975 153774



recD-Exodeoxyribonuclease V (Alpha Subunit) 

NSMEKICCYLEqLvENKDSGDITAYIKIPNKTTPILI^ 

spsotkyfqiSydspllyeyrgvfhyltsklikgigpkiaekiiekfqektcyvlditp 

ERWEVSGISCTRCVSICKOLCEQKILW^ 

ei>pfu^emex;igfktadfiamklgvprnsesrlcagiqhsleexqeeghtcypieixi 
dw akllnq dvf dtp i tle e i dtq i lnmqkrkllh i qd i sgt lhwtr ylhlaekt iv sd 

LKRILFSSRRIRSIDGEKAIAWVEENLSIDLAEQQREAIKACFSEKLLI ITGGPGTGKST 

ITOAILKIFEQVTHKIILAAPTGKAAKRMTEITQKHSVTIHALLQyDFKTKSFRKNHDNP 

IDCDLIIVDESGMMDTHLLHHFUCALPDYTTLVFIGDIHQLPSVGPGNIU^LITSNKMT 

VI RLNK IFRQVHDSGIVTNAHRVNEGELP ILYSETGRRDFLFFQKDDQEEALNHI IHLVT 

KFVPQKYHIYPQDIQVIAPMKKGTljGIYNLNKALKHALNPKKANLiiGRFQS 

I RNNYNKEVFNGD IGYVST INFEDKAVWRMEGKHVGYSFSELDDLVLAYATSVHKYQGS 

ESPCI 1 1 PIHTSH^LYRMXYTAITRGKKLVILVGTKKAIAIATRNNRVQHRCTGLAE 

VLKELDTKKNYADL 

CPn_0124 156575 158068 

No robust homolog present in Genebank/EMBL as of 11/7/98
IRSKQRTVAITLLVLG ILL I ASG I I FLAVAI PGLSS AVALGLGCGMTALGTVLL ITGLVL 
LIRSEKIJU^EQVEIKQARTRVNNELDQLSQYVFYTEW 

TNLEQD I EE I FLTLRDIRNALDNEEFFWTHAKQCLAQVGESLFQDAS I DEF INLAHLS E I 
RQHLDINDPRWSMITKKWGTWRFIWSTMYKQIKSNFEKSDFGQ 
VLYOSFQKGYNRAAIXSE^TRIIHTSSLLHWEKDEDKHLNIKNECASRiEI^FKKFRTLFL 
GLSEEDVIDFTGASGWDCSKLPRKEVPLOGGKKKIJlFKRTFADF^VGCWDRTTSLEHWrP 
" QEEDPLDRIi<DQVEQEATSVLKDQDRYWKEIETSEAKFRSLPREDDFEKOSQIDSYIRDL 
DDHLSVWANQLSAAEDALIEVTWQEHGNRF>fIJ<NICXWLELIEDAVKATLPRVDFIQEL 
LEKEELPLVAARMSLENS 

CPn_0125 158072 158605

No robust homolog present in Genebank/EMBL as of 11/7/98
KIS&AE IMS EVKPLFLKNDSFDLATQRFQNL INMLQEQAEI YNEYEEKNARVQNEIKEQ 
KDFISCRC I EDFEARGLGVLKEELASLTRDFHDKAKAETSML I EC PC IGFYYS I HQEEQRQ 
RQERLQKMAERYRTCKGVLEAVQVEQ 

CPn_0126 158806 161085

No robust homolog present in Genebank/EMBL as of 11/7/98
' LLVfeYYCMGLFFFSGAISSCGLLVSLGVGLGL 
AP&LXOLEDASERLRVKASP^LASLPKEISOLESYIRSAANDLOTIKTWPHKIJQRLV 
SRKE^LAAAQNWISELCEISEILEEEEHHLILAQESLEWIGKSLFSTFLDMESFlj 
HLSETOPYLAWDPRLLEITEESWFAA/SHFI^SAFKKAQILFKl^^ 
EL&SfFIYKSLKRSYRFJjGCLSEKMRIIHDNPLFPWQDCS^ 
EKf FFWIJ)EECAISYMrxrVTOFLJ^ESIQNKKSRVDRDYISTKKIAIJCDRAilTYAKVLD 
PTTEXjKIDLQDAQRAFERQSQEFYTLEHTETKVRLEALC^CFSDLREIATNVRQVR 
NANDLKESFEKIDKERVRYQKEQRLYWETIDRNEQELREEIGESLRLQNRRKGYRAG 
GRiKSLLRQWKKNLPJ)VEAHLEDATMDFEHEVSKSEIXSVRARLEVLEEELMDMSPKVA 
IEEL,LSYEERCILPIRENLERAYI^YNKCSEILSKAKFFFPEDEOLLVSEANLREVGAQ 
KQYQGKCQERAQKFAIFEKHIQEOKSLIKEQVRSFDLAGVGFLKSELLSIACNLYIK s, * 
KE$ IPVDVPCMQLYYSYYEDNEAWRNRLLNMTERYQNFKRSLNS IQFNGDVLLRDPtfYQ 

pecShSt RLK ERELQ ett lsc kklkvaq drl s eles rls rr 

■ 5_K 

CPn_0127 162152 161130

ytffr^Cat ionic Amino Acid Transporter 
ES FMFPS ANQESRT RNVPLG I FHGLVACLYWG I VFV I PNFLGSFGDLDI VLTRtfT IFGIF 
SL I^AI KNPSVI KKT PLY IWRKSLLWTLL INPVYYFG ITLG I RYVGS A ITWI ASLAPT 
AVL^'SNTKQKELPYSU*FAISSVIITGVILTHLSAIiJLPTAASPLYSIIX»IAVIL5TS 
LWIWIRNQSLLEKHPNLTPDTWSYLIGISALIICLPMIIIlX)LCGITIwTHNLISHTP 
GS ERLL FLLLC S AMG I FS S AKAL I AWNKASLNLS P ALLGA ILIFEPI FG/VLTYLYSQS L 
PSLQEG IGI FLMLGGSLLCLVLFGRKVQKSLENSQVSSSNE 

CPn_0128 162262 163053 

bpll-Biotin Protein Ligase 
EDRGRMLRNQVLVYCSEGVSPYYLRHTIRFLKYYSTQEGAFDILftVDGNFLIKNPFWEET 
TR LLVF PGGADR PYHRVLHC LGTAR I FQYVS EGGNFLG I CAG A/FG S KM I YF YEPEGAP L 
CCARDLCFFPGTAKGPAYRGNFSWSPSGWVSPQLFSDFGLOYAMFNGGCFFEGSEGYP 
GWIESRYDDLrcKPASIVSRIVSKGLAVLSGPHIEYLPHY^TtMVKENVQKTREFLQRER 
TT LDR YCONLVQR L RQ P AFS KA DC 

CPn_0129 163747 163064 

similarity to CT036 
DEQY tLSH I HMDPRrFVTSEPLQKTYQKLQEKHVNNL^I ASQVSLTDLONKTQYENNLI E 
TTTNE ITYYFPWHNPD I LRSEWDP I SNQLYL I FKKBF I HYHNLFSTALERNQ I LL IDSL 
OTGSSNPIAROMELLAFLCVFEQLDYNEDETYTIEPRDYFNRFVYKNSOTAPQIQSFGLLH 
GY EEMS Y ASNN I RNVLTH5 I VLCS P I LYQ L IT EFD/TK I HADDFDCLI 

CPn_0130 164251 163751

No robust homolog present in Genebank/EMBL as of 11/7/98
:;:JMVKC:;:; t f llENKKPACt.LPESKFAAITKLSLAIL^LFUj t AAC ILIALSGLLPNTLL I 

rAL.'xrr; n vr^vnusLL [(rrot^JKnvyKDBQKPK.: rFPKETP.^LDPWLLNPLKNK iq^ 

[•TLLLDPr:; INLKNEt.FFPSFEEWKK [FLItpPDFLIK.'JALANWK CLE 
i TiiJJl i I 1 1-444 I Id/ViHt) 

No robust homolog present in Genebank/EMBL as of 11/7/98
l- , ::::LKKKRF::i.:'.lwMI-*f.l[ : 'FFT:;A'i , VRr:* rc.FLF.l.FMICNAM:;::::t-VY(\K IPSW I I.KTSVAQ 
KVFKKH' ;KC. IOVI .L:rr::VMLFlC.UJVTAI' t FPOYL I7FVLT ! AU.MLA I SLVf ,FLL IRSV 

it: :: :mvdri ,w< ■: :kk\ ;y ai .1 mikni ;Pi-/[ jvkrva 1 l LU'.;i'Y l kvhalwp: u ;ij I PEDPSQAA 
VI.[,l.::WIM-M^:;VIVKAiaj^l\M-:/l-:f;KY tL.PVl.PKI.::R[Kf<V::[.I.VFl.:;AKTLDDLNEO 
. :VHI ■! MI INKKKUT [ NKKA[<K^ 



ALLKNFCO 13 1 KDLKQFLV! ^^^^ 
CPn OH: 166561 

No robust homolog present in Genebank/EMBL as of 11/7/98
SM^A^PHTSVTA^ 

AFWGI I F5VLALVACVFFLYFFYFSSEEFKC|CS0EFRFLP I PAWSALRSYEY I SQDA 
rNDVIKDTMOLSTLSSLLDPEAFFLEFPYFN^rVNHSMKEADRLSREAFLILLGEITWK 
DCETKIL^PWLKDPNITPDDFWKLLKDHFDL^FKKRIATWIRKAY 

mknmi v .!,.. !-:*:!!Wr". va.:;.. 
CPn_0133 167349 166564

CHLPS hypothetical protein

NSSAYM FKLLKNLFL IGCC I VGYFWrfRKES IVEQWLSNRLHTQVTVGRVS I RTSG I K I RH 
IC IHNPLASERFPYAAEIEYADVRBBSISMLLTKQLEISELI IHGANFTIFPYDSHCTKT 
WSLVWKNFHPQKETPSI^WIDRA^IRRCLFI^RI^ 
HTSSAKELPKI^EALPSLLYLAI^ESLYHLNLPG^ 
DINTPGTPTEEIIGFIRGLFF 

CPn_0134 169131 167467

groEL-HSP-60
FADYRKLRRTTMAAKN I KotIEEARKK I HKGVKTLAEAVKVTLG PKGRHWI DKSFGSPQV 
TKDGVTVAKE I ELECKHE^TMGAQMVKEVAS KT ACKAGDGTTT AT\TAEAI YS EGLRNVTA 
GAN PMDLKRG I DKAVKwVDELKK I S K PVOHHKE I AQVAT I S ANNDSE IGNL I AEAMEKV 
GK]^SITVEEAKGFET3^DVVEGMNFNRGYLSSYFSTNPETOECVL 
IKDFLPVLQQVAESG/PLL I IAEEI EGEALATLVVNRLRAGFRVCAVKAPGFGDRRKAML 
EDIAILTGGQLVSEKjGMKLENTTLAMLGKAKKVIV^ 

KKQ I EDSTS DYDKHXLQ ERLAKLSGGVAV I RVGAAT E I EMKEKKDRVDD AQH AT I AAVEE 
GIL PGGGT ALVRGI PTLEAFLPMLANEDEAIGTRI I LKALTAPLKQIASNAGKEGAI ICQ 
QVLARS ANEGYDALRDAYTDM I DAG I LDPTKVTRS ALES AAS I AGLLLTT EAL I AD I PEE 
KSSSAPAMPSAGMDY 

CPn_0135 169448 169143

groES-10kDa Chaperonin
MS DQATTIffllKPLGDR I LVKREEEEATARGG I ILPDTAKKKQDRAEVLVLGTGKRTDDGT 
liPFEVgVGDIILMDKYAGQEITIDDEEYVILQSSEIMAVLK 

CPn_0136 171419 169569

pepF-Oligopeptidase
KG^^SIifTTELKTEALPTRTOVDPKHCWDTTli^ 

SP&QIDNPESLLEIiSKKFSVERKUXJLYIYAHLIHDQDITNPEGESDYQSIWL^ 
DEISWIQPALIALSEEXVAALLSSSVLAPYRFYXiEKIFRLSPHTGTANEFiCILASSFA 
VSNKAF S SLSDAE I PFG I AKDSNG EEHPLSHALASLYMQS PDQELRRTAYLAQ FQRY 
. . DYRirrFANIXrTCKVQAHLFEAKARNYPSCLEASLFQHN I PTTVYINLINETKKHTSLIN 
/ RYFNUCKEAIJ^KEFHFYDVYAPISGTrSKNYSYEEGVDLVCKSI^PL^ 
LSNRWVDRYENKHKRSGAYSSGCYDSAPY I LLNYTNTLYDVSVI AHEAGHSMHSYFSREA 
QPYHDAQYPLFLAEIASTFNEMLLMEALSKSDOSKEDKIVI ITKTLDT I FATLFRQTFFA 
AFEYEIHSAAEG^PLTEEn^ATYGNI^KEFYGGVVTSDSLSALEWARIPHFYYNFYVY 
QYATGI I AALSFAEKILTQEPGALELYLKFLKSGRSDFPLN I LKKSGLDMTTSAPLDKAF 
AFITKKIDLLSSLLSED 

CPn_0137 172263 171502

ybgl-ACR family 

VC SMNVADLLS HLETLLSSKIFQDYG PNGLQ VG D PQT PVKK I AVAVT AD LET I KQAVAAE 
ANVLIVHHGIF>niCGMPYPITGMIHKRIQLLIEHNIQLIAYHLPLDAHPTLGlJNWRVAtI)L 
NWHDLKPFGSSLPYLGVQGSFSPIDIDSFIDLLSQYYQAPLKGSALGGPSRVSSAALISG 
GAYRELSSAATSQVDCFITGNFDEPAWSTALESWINFLAFGHTATEKVGPKSLAEHLKSE 
FPISTTFIDTANPF 

CPn_0138 174094 172700 

"hemL-Glutamate-1-semialdehyde-2,1-aminomutase"
TNS RLFLA I KDQ LLQNMWKLTKRNSMLNC SNQKHTVTFEEACQVF PGGVNS P VRAC RSVG 
VTPP I VSSAQGDI FLDTHGREF IDFCGGWGALI HGHSH PK I VKAIQKTALKGTSYGLTS E 
EEILFATMLLSSIJCIJCEHKIRFVSSGTEATMTAVRIJVRGITNRSIIIKFIGGYHGHADTL 
LGGISTTEETIDNLTSLIHTPSPHSLLISLPYhNSQILHHVMEALGPQVAGIIFEPICAN 
MGIVLPKAEFLDDIIELCKRFGSI^IMDEVVTGFRVAFOGAQDIFNLSPDITIYGKILGG 
• GL PAAALVGHRS I LDHLMP EGT I FQAGTMSGNFLAMATGHAA IQLCQSEG FYDHLSQLEA 
LFYSPIEEEIRSCGFPVSLVHQGTMFSLFFTESAPTNFDEAKNSDVEKFQTFYSEVFDNG 
VYLSPS PLEANF I SSAHTE ENLTYAQN III DSL I KI FDS SAQRFF 

CPn_0139 174686 174093 

yqgE 

S PTKNKLRD IMK I PYARLEKGS LLVASPD INQGVFARSV I LLCEHS LNGSFGL I LNKTLG 
FEI SDDI FTFEKVSNHNIRFCMGGPLOANQMMLLHSCSEI PEQTLEICPSVYLGGDLPFL 
QEIASSESGPEINLCFGY3GWQAGQLEKEFLSNDWFLAPGNKDYVFYSEPEDLWALVLKD 
LGGKYA3 LSTVPDNLLLN 

CPn_0140 175140 174673

yqdE 

PRSNQQKIFCMSLEKELLEETPLVLLNFYKLVSFCNTMGMILGTEEKKFAIYGHVSMGOA 
FQGADTEGHSPORPFAHDLLNFVFSGFDIQVLRWINDYKDNVFYTRLFLEOKDREFLYV 
VDVDAP, PSDSIP LALTH K I P I LCVKSVFDAWP Y EE 

CPn_0141 175817 175110 

rpiA-Ribose-5-P Isomerase A

HSSSAVEKDLHLHEKKCLAHEAATQVTSGMILGLGSGSTAKEFIFALAHRtOTESLAVHA 
I ASSGMSYALAKQLA r PLLNPEKFSSLDLTVDGADEVDPOLRMI KGGGGA E FREKI LLRA 
AKRSI ILYDESKLVFVLGKFRVPLEISRFGRSAI IEEIRHLGYECEWRLQDTCDLFITDS 
^NYIYDrFSPNSYPNPEKPLLKLIQIHrr/tEVGP/TEKVEWGSNSOCLI:'KKYSV . 

CPi\J)lA2 1^121 1753U, 

No robust homolog present in Genebank/EMBL as of 11/7/98
fiHSYIJF.-KGTLEKFHFK I LOLL.^TRNfJ I LNFC. r ;HFE I .^RVnHDNA IQK I HSYPLKP I AEN 
U I NT! ..*; FK DLK I DY PK DS.^'K RFPFLYH [ r JP [ PLMK LW KY P/T 

r:PnjiM: i:7i4 7 l7r,;»M 

PRM' IMirrrJLKRPLKl'HFPWGrJFLRPEHLKKTPE.'JLKEG'J [,'jLDOLMUI l"L> I AHJDL [KK 
rjKA^V ;L:;r CTCX lEFRRA'I'WI lYDFMWf IFI I^V^flMPATEGVFFCX JERAM [ DDTYLTDK r:JVn 
IflU'FVrjMFKFVKALEDKnTAKOTLPAPAOr'LKOM I FPflM IF.VTRKPYPTNOEL IKDIVA 

^YWKvr^Di/( f DAc;c]n'r^MJXx: , rRO(;[/yijpi<v^\';w*rGn)F.Kf;r/^Di.iWY[.LtNNt..vrAD 
Rt , DDLr/ML,llVCRCN , ^H.':KFFA:^."YDFIAKrLFI
rcEKTVr.'LCLVTSKTPTLENKDEV EAR IHQAADYLPLI 
EOWAKVALVKErSEEVWK 



,EFDHER3GDF3PLTF I 
'0 FA3C E IGNK LTEE 



CPnJ)M4 



1770-12 1805^0 



tO/LG^FMEKF^DAV3 

N PHlJ^rVVVKOA L3R E PTWEG EVDPK PS POLOTLLRDAKQEAKTLGDEY I SGDHLLLAF 
., • v : .r/ M . /nrv: •! .Kr if.: tk : \- < ;: i!-: *: \-r: • ak: n-v .F.r v :knl7Al.af ^ 
.!;.'"..'■ www. iwrrniriiw.KKi-r.- mA\,\»iiMJhuvwr&.L*', 

KOL'rVLDMGAL I AGAKYRGEFEER * F IDEVHTLVGAGATDGAMD 

EKY^ I FHG\^TEGALNAAVLL SYRY I PDRFL PDKA I DL I DEAAS LI RMQIGSLPLPI DE 

DI^IVGIQMRRIAQRLKARRIN^ 
SKALLKGDI KPDTS I ELTMAKEVLVFKKVETPS 

CPn_0145 180717 182369

SaIsfKIhr^M 

ILT^KV^GFSPEEISLIQKLSYPGI^LASLRGSra 

RADYY SNCLDI LALR I HAERQRYLDQS PCVPGTSEFHKAT I ^F^EEAWYPSKK 

e^sdefsflssvtdrkfgvcl^ 

^AGGRHLPTASYCDCLDLEDLQVRTPEEMIGLTFWQGSFAWKKXYKE^EAY^OE 

S^eweujgfvqii/w^ 
pgs>iye£asyeeeuckamkssmpccegq^^ 

SLHLRLCKILCDRHEYTKALKYFI IAERU4EDCCFLKKDNR5FALFYEVKKI ISKVAPQK 
ANTLLLMESER 

CPn_0146 182595 183095

No robust homolog present in Genebank/EMBL as of 11/7/98 

iiwismsssevvfctvhglgfgglssksvvpfkkslsdaprvvcsilvltlglgalvcg 

iaitcwcvpgvil^icaivu^isijvlslf^ 

ggfsraapsgmglpgdgsprastpscleelqaeiqavtqaidqmsdd 

CPn_0147 183213 183671

No robust homolog present in Genebank/EMBL as of 11/7/98
KG(SPMAVQS I KEAVTS aatsvgcvncsreai pafnteerats I arsvi aai iawai sll 

GIpivviAGCCPLGMAAGAITMI-LGVAIJJWAILITLRI^ PSPGNNGEPNERN 

s4t|pleggvageagrgggspltqldlnsgags 

CPn_0148 183822 185702

pkn1-S/T Protein Kinase

Gi^VSSMESEKDIGAKFLGDYRILYT^GQSLWSEDIiAEHRFIKKRYLIRLLLPDLGS 
SQEJMEAFHDVVVKLAKLNHPGILS I ENVSESEGRCFLVTQEQDI P ILSLTQYLKS I PRK 
L?AlVDIVSQLASLU3YVHSEGIJVQEEWNLDSVYIHIL>K^KVILPDI£FASLIKER 
I Lp3F I SDEEhniESKIKERVLLHTSEGKQGREDTYAPGAITYYLXiFGFLPQGIFPMPSKV 
FsMtflYDWDFLI SSCLSCFMEERAKELFPLI RKKTLGEELQNWTNC I ESSLREVPDPLE 
SSQNLPQAVLKVGETKVSHQQKESAEHLEFVLVEACS I DEAMDTAI ESESSSGVEEEGYS 
LAl^SLLVREPWSRYVEAEKEEPKPQPILTEM^IEGGEFSRGSVEGQRDELPVHKVIL 
H SKELD VH PVTN EQ F I R YL ECCG S EQDKYYNEL I RLRDSR IQRRSGRLV I EPGYAKHPVV 
GVWi'GASGYAF^IGKRLPTEAEWEIAASGGVAAI^YPCGEEIE^SRANFFTAinTTVMS 
YPi^PYGLYDMAGNVYEWCQDWYGYDFYE I SAQEPES PQGP AQGVYRVLRGGCWKS LKDD 

lr£ahrhrnnpgavnstygfrcaknin 

CPn_0149 185706 187700

drsSSf-DNA Ligase 

ERE^IKE ENSQAH YLALCRELEDHDYS YYVLH RPR I S DYEYDMKLRKLLE I ERSH PEWKVL 
WSTP^RLGDRPSGTFSWSHKEPMLSIANSYSKEELSEFFSRVEKSLGTSPRYTVELKID 
GIAVAIRYEDRVLVQALSRGNGKQGEDITSNIRTIRSLPLRLPEDAPEFIEVRGEVFFSY 
SfPtft I NEKQQQLEKT I FAN PRNAAGGTLKLLS PQEVAKRKLE ISIYNLI APGDNDSHYE 
NLQRCLEWGFPVSGKPRLCSTPEEVISVLKTI ETERASLPME IDGAVIKVDSLASQRVLG 
ATGKHYRWALAYKYAPEEAETLLEDILVQVGRTGVLTPVAKLTPVLLSGSLVSRASLYNE 
DEIHRKDIRIGDTVCVAKGGEVIPKVVRVCREKRPEGSEVWNMPEFCPVCHSHVVREEDR 
V 5 VRCVN P ECVAG A IEKIRFFVGRGALNI DHLGVKV ITKLFELGLVHTCADLFQLTTEDL 
MO I PG I RERS ARN I LE S I EQAKHVDLDRF LVALG I PLIG IGVATVLAGHFETLDRVISAT 
FEELLSLEG IGEKVAHAIAEYFSDSTHLNEIKKMQDLGVCISPYHKSGSTCFGKAFVITG ) 
TLECMS RLDAETAIRNCGGKVGSSVSKQTDYVVMGNNPGSKLEKARKLGVS I LDQEAFTN/ 
LIHLE ■ 

CPn_0150 187759 192444 

CT147 hypothetical protein 
1 1 YYKFFYSYNCPY F I SFFVLLCVNMASS SNNSTKQDG I PSWVNPNVQWNRASQVGOTEA 

n5ltpeaotsrswfsdrkhflevldvsleemenndlkkysryktiiliatlvtvai7civ 
pismvfgipmwvpclilfgaglssaflshrlqskckeihlryrayq lyrqqllsqyypdlr 
kstlykystthvkpkkgfvgklvenlrpdlhknkddggaaadsrldfagygvkhyatdal 
l/ivsgvnsvekjrlaslimsvkndilndvgsrepidkaqrsalwsgkdiggeldpggil 
0 i 3rd i la icgycmnvgveakkai dqykkwylnsstfiawnpqlpai aqsyllecqrhld 
yaak i fq dls altt ahgtgq al edlds llc yydql i e3kgvgeki i asi hqkhhdlamqd 
::ct>dehlkkwsnlyhvfsitikeftegkleonewsrrorlrgklekskcsilgncrtna 
eyatksekkladyllqicdrepfltgmhkaiatgkaiqgkvegvisqhpekq/mmlrcsi 
erlegmlrredwgailqknedevl^lkgtmeaqu^fkdlvgtwegkyoefriknklskvl 
vy dftk:; ysnllnrlevlh aeg 3tddlvlhvdrms edlkkt i ee i dgnlfqvtpeelsll 

AkEYQtllJ^ELPLEVQEGNRLOEAICSEGVSQGmLUISI^RDEKrNKNreSSRKNLVA 
I AKQAR:'DARN t D:;OGLAPLIQRNRASLDN I LONMYLFNGS I RN [HALD^ETLVATGSNM 
[•':;/^HTFDWNlYTNL.LDVLEIQSKPAPAPMENrDLP'';ALPEEVQDAVAEDVf:GTHRLHHQ 
Viyi<Rt^\[JLKNM[SQLOK^[NKWGMAKArVLCrVA7LFCVL::AIF[CQMIL.':LLILSCVG 
LLLTOV'.'l'l. [FDR [SK^KEFEKOVLETAOSLrPATKILP.'jEFNNKDLNRLAKLODNLNLE 
' IPTWAKN IVSDt.EG [ rrKEK.SLKDLTKEFRKD:^K!ILNKR IKRRFKEGUjQEAPVVRPT 

I V'jU I Ki lAKVFAKLHRELEHLOKOKEEU I RCDALVOEPMCLCLEK^KYDNEKAHAAAMT 
KKVnKI.ONll)HI-OKNNLTYVRIONFFRTLIQEKIjf;RL^/QErDWi^\KELHELAAirYG 
NM , :;^K::^KL>f<AKK^FKENVLH[A(JKnO[.ELLEAYLrr/'rACC*nXRU<. ) MOAr;FRERILLNP 

I I ;AKI ii iKAKHTI .A: tR EKMLKTLGL^Y LTPFVR F. L >w ['F/iTQ: *f lYNyl LKVREQLFDIEQRL 

(j\ iy;rv::if,i >y aavqaalaayvr ki i ehl r v.'JTYGLGaoe' ;ot:;:;kvttlmrdlhaveelv 
KM';vi ; :m<iM<:;ivnjiRvii:;vuir.iiLR 



QKYMQ 1 LLDAPVS LLYGAFi 

weeakqrlea I aaelddlrj 

GIQEQYAEMQC I EDLELKQK f 
AA 



af^^Bt 

iLRi^^^PE! 

JCQKFEDLVKI 



194178 



LLNFTELN I AH5TKAAEEEAKRYVEEKGRGFETY 
CE ! PLANLK I 3 I FSDLNLREKVSVEKAALEEEIQ 
iKKLEALEERLLQICRRIDSSVDKQKELLCLLCREE 



CPn_0151 

CYE^FWPR^MADILVIGANFTGL lU^WLIQHG I3VKVIDHRASPEDPSFLDCRKLP 

•■r f* iir Vr^ : ^P< •/ : w. ; vr; :,/;:. • .-• : t:'7.vnff;ip:-: : yni-kw : :.-v:eadn 

R^^PAIJ^f FLKGCRKFNTTG EEYYYPPHOALKYRSSD I IKMSPQDKEIHGPGPGMRAI 

DARLENGSFLLDPLKSSKHLLIFFKDIPDLKEAWEEYGEWIEI 

NSLFI IRPDRYIGYRTHTFKLHEIilSYLLRIFASEKTS 

CPn_0152 1952/4 194318 

lTkmrkv^vs^ 

DHNLIGVl^LPNTPTPEGGFr^FHGFRGTKFGGLTGAYRKUJRKFAAVGIATLRVDM 
AGCGDSEGVAEEVP I ETYLHDAQT I LETVQEH PDLNAYRLG I ^^^^oJ^rernuTTT 1 
PRDLNIKALSVWAPIAIX^/lXKELYENFSKHGEGDI ^2 V ^ KD ^^^ ^o!!^^umt^t 
LIRrQDHVTANSLFTKPY^WIDOTLVSRTC^LFKNTAPGRMTFISYPmXSH^^ 

APDLDMILDQIVSHFORX 
CPn_0153 195430 197892 

^YDPNLIEKKWQQ%^^^ 
nMRYKRWlGFSVLWMGWDSFGLPAEQYAIR 

EGREFATSDPDYYlSfrOKLFLFLYDQGLAYMACMAVNYC PEIXnVLSNEEVENGFSIEGG 
YPVERKhOQWIu/lTAYADKLLEGI^ALI^F^Q 

LEAFTTRIi)^LG)vSFLVI APEHPDLDS IVSEEQRDEVTAYVQESLRKSERDRISSVKTK , 
DDNGVCIHSNYwFCLNGLSGQEAKDWINYLEMRSIjGRACT 

ipiihfedgWpleddelpllppniddyrpegfgcgpu^odwvhiyd^ 
tytmpqwags5wyyij*fcdahnsqlpws 

VFYDAGLVST* EPFKKLINQGLVLASSYR I PGKGYVS I EDVREENGTWI STCGEIVEVRQ 
EKMSKSKLN^PQVLIEEYGADALPJ^AMFSGPLDKNKW 
TSSEVQDI^RMLVIJ^KLVFRITEHIEKMSIJ^IPSSFMEFUIDFSKLPW 
AVRVLEP iXpH I SEELWVI LGNPPG I DQAAWPQ I DESYLVAQTVTFWQVNGKLRGRLEV 
AKEAPKE^VLSLSRSWAKYLEN AQ I RKE I YVPNKLVNFVL 

CPn_0154 197874 199202 

TSEFCP^RG^RIFKCFYWVLV^ 

PGEGfcvWFHGASVGEVRLLLPVLEKFC EEFPGWRC LVTSCT ELGVQVASQVF I PMGATV 
SILK^FSIIIKSWAKLRPSLVWSEGDGrfU^IEEAKRIGATTLVINGRISIDSSKRF 
KFLKRLGKNYFS PVDG FLLQDEVQKQRFLSLG I PEHKLQVTGNIKTYVAAGTALHLERET 
WRI^KlJlLPTDSKLVILGSMHRSDAGKWLPWQ 

i^pyglwsrganfsyvpvvvvdeigllkqlwagdlai^ggtfdpkigghnl^ 
/lifgphitsoseiaqrixi^gaglcldeiepiicjtvsfll^evre^ 
tasfdrtoralksy i plykns 

CPn_0155 199697 199488 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NSJffiFGVPFLEKLKI SLI P I EEMRHELFMKTHNSSSNGFSNQEKG I RTYFKSDLLGYEDL 
RENINPN 

CPn_0156 200147 199770 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IXKQKLLARHMMHNIWLSEEPGRSAFLGRTAFFPNKYPIAOGGVG I PSTIGNLFTIWYC 
FTFYRAATPQSDHPIXKGFILLERIJ<ELGAGFFYCDLRESNTTGFTLFFEGSNKGVlJa^ 

LFIRDE 

CPn_0157 200753 200298 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FSFVIYKEAIJ^IYQFSPGASPNWOASLMAQLNSYFCLGGETVTRIISLRPSGLILAKKE 
KAWSTAEK ILK I LSF ILFPLVL I ALAI RYLLYNKFNKDLDRAVFF I PTE ITKAEELI I A 
KNPALVKEAALTVSPLFYSLPKKYQLMKVETP 

CPn_0158 201463 200894 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PPKITLSINIDLLLEDLDTDSIPWPKLYLSEDFDFAYYPESKAIIDTVAKLEKNNPGEEF 
CLESKK I LARYLLEQLFKLETGLNFPTST I DGGR ES FL I EFS H ETKK PTVWA F I YFYYYH 
SNGPKLEKDFKQACCEVHNRLLNLGLK-^RPOAGAQNDGRNGGPYGPIGFLIVWEENYGSV 

LKDHGFIKDN 

CPn_0159 201811 201467 

No robust homolog present in Genebank/EMBL as of 11/7/98 
CCFGGETATRI FSMTPSGFSLATEEK7CVSTAEKVI K I LAL I FFP I ILI ALAI RYFLHRK 
FDRKCFVI PODTPKELELI LAANPOLVEKAAREVHPGFFALPTKYQSMY IQTSKG 

CPn_0160 203794 202127 

pfkA-Fructose-6-P Phosphotransferase 

TVELLSLNKSYFEIORLRYRPEILTLLETIRSKHIOETSSPPSPPPELQKHIPNLCRIPE 
VS I YTEOETS5KPLK IGVLLSGGQAPOGHNW IGLFDALR VFNPKTRLFGFI KGPLGLTR 
GLYKDLD I SV I YDYYNMGOFDM LSSC P.EK I KT EEQK KN I LNTVKQLKLDGLL I IGGNNSN 
TDTAMLAE\'FLAHNCKTGV ICVPKT I DGDLKNCW I ET3LGFHTSCRTYSEMIGNLAKDAL 
SAKKYHUFIRLMGOCA^YTTLECGLOTLPNIALIGELIATRKr^LKOLSEOLALGLVRRY 
KSGKTr^.TTVL I PECL [ EH \ FOTRK L I OELNVLI .ANGD.'jr: [ EK I L:'K L.T PETLKTFHLFPK 
Dl ANQLLI ARD. f ;HGNVRVSK I ATF.ELLAVMVKKI-; I EK I K PMMEFHSVJIMFFGYEAR.^GFP 
JNFr/INYG I ALG 1 1 SA t.F L VRQ KTGYM IT t NNI AO^ YTEWQtt :ATP1 .YKMMH LENRCGTE 
TPV r KTDSVPPK.1PAVON [.UtfjClDrS: I -7F.DLY HFPG PLQYF< IK EEL I rx^RPL.TLf.WENQT 
MSr'PjALYSTiJGKRSL 

(predicted Acyltransferase family) 

Hr^ , ;:;T^RKOE[ J LLr , [H;VVLt-:KHEgRTMF:;LT[ J LNNF-XrF^l.l.ll , IMM J ltYNPPYf'IV[LLHG 
IA:;[^KTGSKR:;HVRtAOELTRLf;rAAL^V[M.l>;iIGDCEf;FlWnF:n.i:NYKUMtHE[ IEYT 
n::i.l.H I DOERLA t Ft'iliJUIGTLAiyyr ! ,1'FFNK I KA[ .AVWAl'Tf:!* '.Kl WAAKAOKNAPEV [ 
TM: JUKG A IT YACMTLNPDFYTQFLk I D t VKELMPSARJj^^^YMQCEQDLLVS INHRTL 
rT EA FANODK P tT I LTY PDVOHAFP F AESS ALS DLTQ^^BrSC E 

CPn_0162 205870 204803 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FVYTLVNIQSPFRIMKLYSISSDVDTPWIFQL^KVDSYLFLCGNRIKWSIVMQEPNLI 
[ C KVENVR I5TIVKILKILSFLIFPLIL I ALALHYFLHAKY ANHLLVSK ILERAPQYVP I 

ix;rscdtashyklttlvpv!»oknloamgsnpleveaalrttkpsffcvpakyrqii ISSH 

: :; ;-i LULL. ;v: 'WtTKY I .:!: JTMDP \ IKAOKRV l-JTWONLPTnTY wmY.?. 

. m i lf : r> ■ ! TOF-ru't-iAt-r-rii rr.-y ;pi ,tt.f ; ■ *vry t ysi iftponpt iwpcvffrqc 

PLDEDRGGGFEILEOLQELGVRFPICPSQGPDNPNFQGFQGtRIYWEDSYQPNKEV 
CPn_0163 205831 206394 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FEKAIVYCIKCKQI IKCIS I IHTPTPATPLCTEGEIFPGPVDSAIQNDLERLLTVKKRPD 
1 1 REYLRAGGSLVTTYPKEGQRLRSPEX2LRVLDDLVQSYPNHLHAIELDCGAI PQDLIGA 
TY I ITFADFSTY I LS LRSYQANS PSDDTWG IWFGS I DDPVQAVI SFLKDHGFALPSTLAQ 
DPLLCTNK 

CPn_0164 206444 206998 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LiCFKCIYIKIIFSFLKQI>rrRSTIESSDSLCSRSFSQKLSVQTLJ?NLCESRLMKITSLVI 
AFLTL I VGG ALI ALAGGGVLSF PLGL ILGSVLVLFSS I YLVSCCKFFTLKEMTMTCSVKS 
•KINIWFEKQ RNKDI EKALENPDLFGENKRNVGNRS ARNQLEMI LHETDG 1 1 LKRYMKGAX 
MYFYL 

CPn_0165 206983 207582 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NVLXF>INWPKTIDHVDPESEIDIPJ<WSCYK^ 

RSKYQEQARTVSDEDAPLFCLTRSYYQDGYLTPU^GPRIJLINHYIHLRRRENPKHFFSP 
KH PC YY ARLAFNESVC VYR ELFD I ERLTKMYVEGDYS KEQEKNLQA I LS FVKTLDEGKDF 
LI EHKDTDL IGRGFTDVFCT 

CPn_0166 207594 207962 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NCLKQYNKSDSIMSES INRS IHLEASTPFFIKLTNLCESRLVKITSLVISLLALVGAGVT 
LWLFVAGILPLLPVLILEI ILITVLVLLFCLVLEPYLI EKPSKIKELPKVDELSWETD 



CPn_0167 208309 207977 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NLfeHFPRGFFMLPFCPTILLJUCPFLNSEKYGLE 

IT^ELSAKDRKFKPGSKNCTLYTEDPILPAHNSFSNCSDIQMRTPISPIH 
CPn_0168 208716 208417 

No robust homolog present in Genebank/EMBL as of 11/7/98 
Sf MlRRRENPEHFFNPGH PCYYARLAFNESVRIYRXIJTTrAELKQMYGAGDYEQQNEDN 
li{| Ils FVQ I LDEKDGFDDFLATHKDTTFIGRGGADIFCS 

CPn_0169 209537 208710 

No robust homolog present in Genebank/EMBL as of 11/7/98 
SF&lEFTIGENNMKNVGSECSQPLVMEIiJTQPIJ^ 
TAjT^AGILSFLPWLVLGIVLVVirALFLLFSYCT^ 
KD^EKATENPELFGENRAEDNNRSARSQVKETLiRDCDGNVL^ 
TMPDVDPVSEDSIRWISCYKLIKACKPEFRSLISELIAAMQSGLGLLSRCSRY^ 
VSHKDAPLFCPTHSYYRDGYLTPLRAGPRYI INRAI 

CPn_0170 211098 210025 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NVRKNH 1 1 RGEKYNTCTVI AFVLSMSYDTLFKNLEKEDSVHK ICNE I FALVBRLNT I A 
EAj'jTiKNL PKADI HVHLPGT ITPQLAW I LGVKNQ FLKWSYNSWTNHRLLS PKNPHKQYSh/ 
FRM^QDICHEKDPDLSVIXJYNILNYDFTISFDRVMAT^ 

LO|i}(§LDDT I VYTEVQQNI RLAHVLYPSLP EKHARMK FYQ I LYRASQT FSItHG ITLRFLNC 
FNK31FAP0I NTQEPAQEAVQWLQEVDSTF PGLFVG IQSAGSESAPGAC/KRLASGYRNAY 
DSlEa^EAHAGEGIETRTIFSSAKVNPEGLIEITRVTFSSLKFuXQPS/LPIRVTCQLG 

CPn_0171 212444 211149 

*guaA-GMP Synthase 
IIKI^SARRHL.hH'IFILDFGSQYTYVLAKQVRKLFVYCEVLP^ 

SGGPHS VYENKAPHLDPE I YKLG I P I LAI CYGMQLMARDFGGTJf S PGVGEFGYTPI HLY P 
CELFKHIVDCESLITrEIRMSHRDHVTTIPEGFWIASTSCCS/SGIENTKQRLYGLQFHP 
EVSDSTPTG^ILETFVQEICSAPTLWNPLYIC<5DLVSKIQprVIEVFDEVAQSLDV0WL 
AQGT I YS DV I ES SRSGHAS EVI KSHH NVGGLPKNLKLKLV^PLRYLFKDEVR I LGEALGL 
53YLLDRHPFPGPGLTIRVIGEILPEYLAILRRADLIFIEELRKAKLYDKISQAFALFLP 
I KSVSVKGDCRSYGYT IALRAVESTDFMTGRWAYLPCDVLSSCSSRI INEI PEVSRWYD 
I3DKPPATIEWE 

CPn_0172 213237 212440 

*impD-Inosine 5'-monophosphate dehydrogenase (COOH-terminal 
region only) 

AP IGAA IG IGPLG ISRAHHLVEAGANVLVIDTAH^SKGVFQTVLEIKSQFPQISLWGN 
LVTAEAAVSLAE IGVDAVKVG IGPGS ICTTR I VSGVGYPQ I T A I TNVAKALKNS AVTV I A 
[XJRIRYSGDVVKALAAGAIX^LGSLI^TDEArcDIVSIDEKLFKRYRGMGSLGAMKC<3 
SADRYFGTOCQKKLVPGCVEGLVAYKGSVHDVLJf QI LGG I RSGMGYVGAETLKDLKTKAS 
FVR IT ESGRAESH I HN I YKVQPTLNY 

CPn_0173 214041 2137A5 

No robust homolog present in Genebank/EMBL as of 11/7/98 
TIFDLIYKIDSYKHQQGFMDFSVFPDRFVBSTSPSPIEDIDAKTLVSNCCHYCSRCLFIF 
[..'JLLSr r IC FSVYGTSGETASLVFG I LSLfl VLVLLI I ECRNRECCRR IS 

CPn_0174 214215 214724 

No robust homolog present in Genebank/EMBL as of 11/7/98 
Y I F LWFVKK I V I L3M I MTT I SNS PSBALNPEL^LI PPPTLVOTTQTJLAYT I PAQCRRS 
TLR [ ILIjIFI I ILGLATt ICTFIVJFFLNGLNLL^TP.'i [ IC'.'XLI IVGLLFLIMGLYFM 
[:j^L0(/JLVOLL0K.ELi;yAEEREEEYI0EIEALRGAPRAE^PTE;;PSTWL 

CPn_0175 ±\4>v/u ;>1527 5 

No robust homolog present in Genebank/EMBL as of 11/7/98 
U-LACF'OFr.UWPCMEOPNOMQDTT^ 

f t p.r ;i .r/rr i: : v n .ekva r r i if* jyh pk fylsf r drdix/^vhyevldgvflktvaac i i ens 



FLTDSMS PELL3EVKEALK|^^^ 
CPn_0176 2l^m / 21*518 

SdIpm^SsafS 
ral^defrywekrldaknpo^ 




CPn_0177 217513 216608 

No robust homolog present in Genebank/EMBL as of 11/7/98 
DKREQTKSKFIFLISfeMK0PMSLIFSSVCLOLGLGSLSSCWKPSW^m^NTSTSEEFF 
VHGNKSVSOLPHYPSAFRrTQIFSEEHNDPYWAKTDEESRK IWRE IHKNLKIKGSYI P I 
STYGSLMHPKSAA^LKTYRPHPIWINGYERSFTJIDTGKYLJCNGSRRRTSHDGPKNRAVL 
NLIKSSGRRCNA^LEMTEEDFVIARRREGVYSLYPVEVCSYPQGNPFVIAYAWIADESA 
CSKEVLPWGYY^VWESVSSSDSLNATCDSFAEDYLRSTFLANGTSILCVHESY^ 
QP 

CPn_0178 218052 217789 
No robust homolog present in Genebank/EMBL as of 11/7/98 
VKEYl£>FL/QR>nraRDPC7rfa*HCTVSQKFG^^ . 
ES ILKQLI&LGI ITGYENREREVWVYLD 

CPn_0179 218550 218056 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PKIWOTHFETRIEATSWKFNRRLRKSFHKSGRSSRPSKACVANFFNFTLQAGRSG 1 1 PG 
KKAiLLNVNDAKTPNYSCIFESIGFFTIEQDLEAQHNG^ 
LPBSLKKDRKFMSSLI FTKLSYALDLSAPMHLEGKPNLSYEEKLD 

CPn_0180 218963 218355 

No robust homolog present in Genebank/EMBL as of 11/7/98 
TSUfKIUXTKYKPWIQhm/ASETYPSQILHAQREVlto^^ 
I CLLDVYHTNHYSVFTFCVDtlYPNIJlFTFVSSKNNEMN^ 
LAACKIRNIE^RVVGIIJLRSGILISKLELKQPQFQSLTEDFVNHSTNOEEARVHQKHVL 

LISLILLCKQAVLESFQEKKRSS 

CPn_0181 219175 218777 

No robust homolog present in Genebank/EMBL as of 11/7/98 

VHELFKIDGVYYFFKXFMKIJYNWSLNSHHEKPSSLEKAVQAJLD 

DD I SRE I YCVRRLY I RFWIVS I SQSLS RI PWRLKR I LLRYCTLRGKYVMP ILIKRIAILL 

GLIRFSRLRKSVY 

CPn_0182 220704 219334 

accC-Biotin Carboxylase 

RCIMKKVLIANRGEIAVRIIRACHDLGLSWAWSLADQEALHVLLADEAICIGEPQAAK 

sylkisniu^eitgadavhpgygflsenanfasicescgltfigpssesiakmgdkia 
akslakk i kcpvi pgseg 1 1 edes eglki aek igfp i v i kavaggggrg i ri vkekdefy 
rafsaaraeaeagfnnpnvyi ekfi enprhle iqvigothcjtyvhlgerdctiqrrrqkl 
ieetpspilnaeirvkvgkvavdlarsagyfsvgwefl,ldkdkkftfmehotriq 
iteevtgidlvkeoihvamgnklpwkqkniefsghiic<;rinaedptnnfspspgpxdyy 

LPPAGPSIRVDGACYSGYAIPPYYDSMIAKVIAKGKNREEAIAIMKRALKEFHIGGVQST 

ipfhqfmldnpkflesnydinyidnllaqgnsffkef 

CPn_0183 221207 220695 

accB-Biotin Carboxyl Carrier Protein 

rrlgmdlkqieki>iiamgrngmkrfaikreglelelerdtreg^qepvfyidsrlfsg 
qerp i ptdpkkdt iketttensetsttts sgdf i ss plvgtfygs papds ps fvkpgdiv 
s edt i vc i veamkvmnevkagmsgrvlevl itngdpvqfgskl fr i akdas 

CPn_0184 221814 221221 

efp-Elongation Factor P 

QWKIKFCCCEEK IMVLSSQLSVGMF I STKDGLYKVT SVS KV AG PKG ES F IKVALQAADSD 
W I ERNFKATQEVKEAQF ETRTLEYLYLEDES YL FLDLGNYEKLF I PQEI MKDNFLFLKA 
GVTVSA^IVYD^IVVFSVELPHFLELMVSKTDFPGDSLSLSGGVKKALLETG I EVMVP PFVE 
IGDVI K I CTRTCEYIQRV ■ 

CPn_0185 222457 221765 

rpe/arab-Ribulose-P Epimerase 

AEVKKQESVLVGPSIMGADLTCLGVEAKKLEQAGSDFIHIDIMDGHFVPNLTFGPGIIAA 
INRSTDLFLEVHAMI YNPFEFI ESFVRSGADR I IVHFEASEDIKELLSY IKKCGVQAGLA 
FSPDTSIEFLPSFLPFCD^AA^SWPGFTGOSFLP^^ , IEKIAFARHAIKTLGLKDSCLI 
EVDGGIDQQSAPLCRDAGADILVTASYLFEADSLAMEDKILLLRGENYGVK 

CPn_0186 222878 224068 

•similarity to Cps IncA 

P I KDK I LMSS PVNNT PS APN I P I PAPTTPG I PTTK PRSS F I EKV 1 1 VAKY I LFA I AATSG 
ALGT I LGLSGALTPG IG I ALLVI FFVSMVLLGLI LKDS I SGGEERRLREEVSRFTSENQR 
LTV ITTTLETEVKDLKAAKDQLTLEI EAFRNENGNLKTTAEDLEEOVSKLSEQLEALER I 
NQL I QANAGDAQE I S S ELKKL 1 3GWDS KWEQ I NTS I QALKVLLGQ EWVQ EAQTHVKAMQ 
EQrQALQAEILGMHNQSTALOKSVENLLVODOALTRWGELLESENKLSQACSALRQEIE 
KLAQHETSLQQR IDAMLAQEONLAEQVTALEKMKOEAQKAES EF I ACVRDRTFGRRETPP 
PTTPWEGDESQEEDEGGTPPVSQ PSS PVDRATGDGO 

CPn_0187 224218 225045 

predicted methylase 

VFLTYTRTLPMHSKFLSRRKKNSSHKEET^WDCIA^oYrJKrVQDKGHYYURETILPOLLP 
SLTLGSKSSVLDrGCGOGFLERALPKECRYLGID [J.^PL IALAKKMRnVN^HQFKVADLS 
KRLF.FVEPTLFSHAVAILSLONMEFPIF.AIIWrATLLEPLGQFFlVL.NHPtrFRrPRASSW 
HYDENKKAISRH[DRYLSFMKIPIMAirP(;OKD. f ;P. f ;TLr;FHFPL.';YWFKEL^r;Mr,FLVSGL 
EEWT.'J.'JKT.'JTCKRAKAENlA.'RKEFPLFI M I :JC f K IK 

CPn_0188 z^mo :i.i(..i(]o 

CT132 hypothetical protein 

kt[.m:x: imfrklfpf:;kkktgukqrlhnn< wajja i [y:; i kvu j iinka::kk.a<:vl;'yyoll 

Trvi'rLVFFLRLi;OIILFTNLNWKEW[.l I KKI-bYKKP I7A I VEA/xYIIATKSN If ILVLVCJ^F 

kvk:wa( : r lmlldlecx jlnk i fp.t:;wpp l :;i .kpi .v: ;yfv rTL,v.';i'M r k r i vt :< m i y itq 

IM[UOYAKLF:JLJHSMTALYF[^RKVPYr.LI/r[ja.[-'O.VAFI,PPVAtUK'[':;AI.I::TLr IG 
:;VWLVF0KAFF:;L^V.«; tFNYJFTYGALVALP.'JFr.LI.L.Y lYTM fYl.FaiAt.TF f [(JNRf ICT 
F FFLGDK 1 LTGCYLQL ITSTY t LALTTROFNEGLSPLT^^BfQSKVP ICEVSQCLDVL 
EKECFLFrYNI^YQPVFNFSELT I KD I ADKLLHRE I FK^^BG ITF I ENG FQNI FTJQA 

SKNKENLTLJEIARRIK 

CPn_0189 226391 229825 

CT131 homolog-(Possible Transmembrane Protein) 

ANOMKRRSWLK I LG ICLGSS I VLCFL I FLPQLLSTESRKYLVFSL I HKESGLSCSAEELK 
^g^^^ rKLTC ^ KDEr/F ^ AE KFELDG3LLRLLIYKKPKGITLGGWSLKINEPAS 

t i .jii""'*'iH \ >\ *'<.'■'. J •<ii. < "■' '• r . :;•.!?';•• :tmkt7. •• ..:vr* .F.r::::F.KFwr»"7r< 
-rir''-! <• : "•!.•/•:: -v.. •!•'.::"/: vaffk : f r A:TfWii.i.::YKM inlijaeath 

IT SKQ I HATVSY AK I PLDITKWKH I E IT5QAQLP EVA I H 
SCSFTCLXHRTANVRLDISKiSCPEETO 

FIINDFKGSU^AN^DAKIEYDLKGSCIAPRQDSKTLAEFSLEG^VDHLFSPESREFKQT 
ANWIHIPSSFIAGIIPMSPGLKAQISSLAGPRIWSIKNAf^FGECPVDI^SD^AQ 
I PLI LNEKS ILLRENLTAHLS INEDVNKAFLOEFNPLLAGGAYSQYPVTLEIDKQNFYLP 
IRPYSFEEFRIQSATLDMGKISIA^mTTMYALFQFLDITDQKQFVESWFTPIFFSVQKGS 
IICKRLDALIDRRIRL^WGKTDIAHDRLFMTLGIDPET/IKKYFHOTSLKTKNFFLIKIR 
G S I S S P EVDWS S A Y AR I ALLKS YS LGN P FS S LADKLF SS LGDST PP PTVH PF PWEKSNFD 
SIENK 

CPn_0190 229901 231274 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LLGIKLMRKRHSFDSTSTKKEAVSKAIQKI IKIMETTDPSLNVETPNAEI ES ILQEIKEI 
KOKLSKQAEDLGLLEKYCSQETLSNLEKTNASLKLS IGSVI EELASLKQLVEES IEESLG 
GODQLIQSVLIEISDKFLSSIGETLSG^DMNQWIQGLLIKENPEKSEAASVGYVQTLL 
EPLSKRIGFTHKKVATHDVNISSLQFHMMSVAGGRFRGHIDMNGYKVI^I^ 
SKDYLERWSSOLTIDKVEDKPITKPNKGKIXYSOTSPKLESPLPI^LLTSGISGFTWK 
S AS KSNDGS F PF SALRHKETES DTDC FQ ITSTTLSGNQAGTYTWSLSLKVLVPS I FQ I EK 
PE^QLSLVYSYEDWLPIDNIFTMSQPRTIPLALLGQTM^^ 
SPNCSRFSLQLKQTNQFENSPVDFYIVHAAHSCHWSGF 

CPn_0191 232039 231314 

glnQ-ABC Amino Acid Transporter ATPase 
QftlRFPVFLGYQKMXjWTIRVRNZAYSVNKraCIIXXj^ 
LRAlAGLVQPTQGDIWIEGEAPALVFOOPELFSH>fIVLGNCT 

FELXhLLDI EEVAKNY PDQLSGGQKQRVA IVRSLCMDKHTLLFDEPTSALDPFATAS FRH 
LL§?LRIXJELTVGLTTHDMQFVHSCLDRI^ 

"fa 

CPn_0192 232643 231984 

glnP-ABC Amino Acid Transporter Permease 
EYGVDHWtAI AJUXLRGCGYTLCVSG IG I ^ 

TVlRGTPLF IQILI IYFGLPEVLP I EPTPLVAG 1 I ALSMNSAAYLAENI RGG INSLS id 
WESftMVLGYKKYQIFWIIYPQ^ 
VSR&JPMEMYLICAGLYFLmTSFSCISRLSEKRRSYDN 

If! 

CPn_0193 233144 232686 

*argR-Arginine Repressor 

KL HGVFMKKKVT I DEAL K£ I LRLEGAATQEELCAKLLAQGFATTQSSVSRWLRKIQAV^ 
AdETOARYSLPSSTEKTTTRiiLVLSIRiWASLIVIRTVPGSASWIAALLDCGLK^ 
LAGPDT I FVTP I DEGRLPLLMVS I AMLLQVFLD 

CPn_0194 233162 234241 

gcp-O-Sialoglycoprotein Endopeptidase 
EV£HTI KGNVFFSWFFMLTLGLESSCDETACAIVNEDKQILANI IASQDIHASpGWPE 
LA^RAH LH I F PQ V I NKALOQANLL I EDMDL I AVTQT PGL IG S LSVG VH FGKG Jft IG AKKS ( 
LIQVNHVEAHLYAAY>L^AQNVQFPA 

FDteRFLGLPYPAGPLIEKJ^EGSEDSYPFSPAKVPNYDFSFSGLKTAVLYAIKGNNS 
Spf*S>APEI SLEKQRDIAASFQKAACTT I AQKLPTI IKEFSCRS ILIGGOTAINEYFRSA 
IQTlfcNLPWFPPAKLCSDNAAMIAGLGGENFQKNSSIPEIRICARY^^SPFSLASP 

CPn_0195 234172 235785 

oppA-Oligopeptide Binding Protein 
Y SGNS YMRK I SVG ICITILLSL S WLQGC KES S H S STS RGELAI N I Ju)EPRS LDPRQVRL 
LS EISLVKH I YEGLVQ ENNLSGN I E PALAEDY S LS S DGLTYT FKLKS AFWSNGDPLT AED 
F I ESWKQVATQEVSG I YAFALNP I KNVRK IQEGHLSIDH FGVHS ESTLWTLES PTSH 
FLKLLALPVFFPVHKSQRTLQSKSLPIASGAFYPKNIKQKOWrSLSKNPHYYNQSQVETK 
TITIHFI PDANTAAKLFNQGKLNWQGPPWGERI PQETLSNLQSKGHLHSFDVAGTSWLTF 
N I NKF P LNNMKL REALAS ALDK EALVST I FLG RAKT ADH LL Pff N I H S Y P EHQKQEMAQRQ 
AYAKKLFKEALEELQITAKDLEHLNLIFPVSSSASSLLVQL/REQWKESLGFAIPIVGKE 
FALLQADLSSCNFSLATGGWFADFADPMAFLT I FAY PSGV/PYAINHKDFLE I LQNIEQE 
QDHQKRSELVSQASLYLETFH 1 1 EPI YHDAFQFAMNKKLStJLGVS PTGWDFRYAKEN 

CPn_0196 235906 237519 

oppA-Oligopeptide Binding Protein 
KLKSYSKERSFMLRFFAVFISTLWLITSGCSPSQSSlfclFVVNMKEMPRSLDPGKTRLIA 
DOTLWRHLYEGLVEEHSQNGEIKPALAESYTISEDGTRYTFKIKNILWSNGDPLTAQDFV 
SSWKEr LKEDASSVYLY AFLP I KNARAI FDDTESHENLGVRALDKRHLE IQLETPCAHFL 
HFLTLPIFFPVHETLRNYSTSFEEMPITCGAFRP/SLEKGLRLHLEKNPMYHNKSRVKLH 
KriVQFIGNANTAAILFKHKKLDWCGPPV^EPI/PEISASLHODDQLFSLPGASTTWLLF 
N [QKK PWNNAKLRKALSLA IDKDMLTKWYQGIWEPTDH tLH PRLYPGTYPERKRONER I 
LEAQQL FEEALDEL0MTREDLEKETLTFSTF3FS YG R ICQMLREQWKKVLK FT I PIVGQE 
F FT t QKNF LDJNY.';LTVNOWT.\AF I PPM J Y IM I FAN PGG I G PY1 1 LQDS H FQT LL I K ITQE 
MKKHLRNQLI I FALDYLEHCHILEPLOIPNLR TALf JKNIKNFNLFVRRTnDFRFIEKL 

CPn_0197 ;M75ij :Jahki 

oppA-Oligopeptide Binding Protein 

KMYKKKllt.DLKFIXIKK I KV I FKMFl'.RwtTLFLLF t;XT( X^'JUYnilKHKQl'.liI I P [HDDPV 
AFSl'KOAKI? AMDI^H AOLLFDtJLTiiw'IfREljNDLELAI A:JRYTV:iEDKC:jYTFFIKDSAL 
WSEXITP [T:U-'D [F^NAWEYAOENiJPiyfc f FQGLNF;"rP.'iSNA FT FULDSPNPDFPKLLAFP 
AI'AlKKPENPKIT- , :^[TTLVEYFP«filINIHLKKNPN , rYDYHCV. , J[N3lKLLI IPDIYTAIH 
M.NHf IKVrjWW IfjfWMO*' T PWEI il Il/o; : OYM YYTY PVEGAFWI iCLNTKS PH LNDLQNRHRLA 

t< : n jKH: : [ i kf.ai jyt ;too part lk *Rii apo pnq y k kok r'LT po lvlty pg d 1 lrcor r a 

!■! [ I .K K'.'WKAA' ; [ f)[ , [ LKi '.I .EYHU'VNKRKVQDYA I AT<JT' IVAYYPGANI. I .'-JF.F.DKLLQNF 
CPn_0198 2-ifWT 2*Q'fa 

YFLTL I ABPVTSPVHHTLR E3YKKG/PPSTY I SNGPFVLKKH EHQNYL I LEKNPHYYDHE 

™cFm<RRrci^ 

Senl^dSieectpiipu^kyw^ 

CPn_0199 241018 241983 

oppB-Oligopeptide Permease 

KCLIGLSLVFSYIKNRILF?HxSLWIVLTLTFLVMKTIPGDPFT^EGC^^V^ 
SRYGLDKPLYQQYTQYtis I AKLDFGNSLVYKDRKVTNI I ST AFP I SAI LGLQSLFLS IG 
GG IALGTI AALKKKKQRRY I LG AS I LQ I S I PAF I FATLLQYVFAVK I PLLP I ACWGSFTH 
TILPTLALAVTPMAFI^QLTYSSVSAALi4KDYVLLAYAKGLSPU<W 
SYSAFLTTWITGT^IENIFCIPGLGKWFICSIKQRDYPVAI^I^VFYGTLFMl^SLLS 
DLIQS 1 1 DPQ I RY^fGKEKKRK 

CPn_0200 241996 242868 
oppC-Oligopeptide Permease 

EKKKHKLWENIKSAPSRS IWKS I IQNKMLVLGLTTLI ILMLGALLLPWFYQDYEQTSLKb 
I L VS PC SRF B'FGTDTLGRCMF ARTLRG LRLS LL I AT I AT L I DVCVG LLW ATVA I SGG KK I 
DFLMMRTTE4LFSLPRIPI I ILLLVI FHHGLLPLILAMTITGWIPISRI I YGQFLLLKNK 
PFVLSAKAKKASTFHILKKHLLPOTIAPIISTLIFTIPNAIYTF-AFISFLGLGIQPPQAS 
LGTLVKEJSINAIDYYPWLFFFPSLIMIALSISFNLIGEGAKTLCLEEGSHG 

CPn_0201 242810 243715 

oppD-Oligopeptide Transport ATPase 

AS I SWGLKHYVSKRDI^DNYLLNI KDLT IT STNPKRT LI ENLSLQLKENRNLALVGES 
GSGjOTITKAILGFLPENCLIKTGSILFEDIDITKLSPKEIiiKIRGQKIATILQNAMGSL 
TP^MRIGMQI IETLRQHHKMNKEEAYNKAMQLLTDVCI PNPKYSFSQYPFELSGGMRQRV 
V^IAIJ^QPKLILADEPTTALDSltSQAQVLRILRlIIO^KQATILLVTHNLSL\/KELCN 
4ciIK03KLIErrGTVKIFLSPKHPYTLKU^VSKIPIKKTSSPIIJC^ 



CPn_0202 243682 244500 

oppF-Oligopeptide Transport ATPase 

VPTSNEYARWFMTTLLS I KDLSLT IRGKK I LNH I NLNL I KGSYLT I VG PSGSGKSS LALT 
ILDLIJ^PTTGTITFHMDPKIPRARKVQVIWQDIDSSLNPCMSIKGI ISEPLNIIGTYSKA 
EQNKEIYNVU)LVNLPKSVLHLKPYKI*SGGQKQRIAIAKALVSKPELLICDEPl^SLin , L 
r^SLILOLiFCTIKKEYQNTLIJITHDMSAAYYIACTIAVMDGGSLVEHACREKIFSTPKH 
TTTQDIXDA I P I FSL I STEMEPSEEYELQVASK 



CPn_0203 <«*»*3oo ^tjowi 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IVPLPQKN^ErrSCMNTYTFSPTLQKSFSLFLLEKLDSYFFFGGTRTQILVITPTNIRLA 
AKKRGCKVSTIEKI IKILSFILLPLVI IAFILRYFLHKKFDKQFLC IPKVISNEDEALLG 
SRPQAVEKAVREISPAFFS IPRKYQLIRIDTPKDDAPSILFPIGI EI ILKDLCIDTLKQS 
NLFLKREMDFLGH PEEKALFDS ICS I EKDQEWMSLESKKLLITHFLKYLFVSG I EQLNPG 
FNPENGRGYFSE I STAKI HFHQHGRYGP I RSSGP IMKEI 

CPn_0204 245691 246002 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PREWAWVFFRNKYSKDPFSSARSIWANPFFGTHHEGNIKIKGMGYQIFTRLKKLGISFSS 
YNS INPNPYFFDEGCFVYWESQFKSALQDHG ILQKQTETFYRNT 

CPn_0205 246073 246327 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IEDSIKGYGSASAFRNPPQLLLKFFLVCEELCILTVATHRALLETPLALSFFKELKTKYV 
YRAKDI LQLHNYKGFTILNTSPLCS 

CPn_0206 246346 247161 

CT203 hypothetical protein 

IVDRRSPACYDS INSDAIGVSLLMDI SHILEDLAYDEG ILPREAI EAAIVKQMQITPYLL 
H ILHDATQRVPEIVNDGSYQGHLYAMYLLAQFRESRALPLI IKLFAFEDDTPHAI AGDVL 
TEDLPRILASVCNDDSLIKELIETPKINPYVKAAAISGLVTLVGAGKIPRDKVIRYFAEL 
LNYRLEKQPSFAWDNLI AG ICTLYPGELFYPI SKAFDGGLVDTSFI SMEDVENI IHEETV 
ESC I HTLCSSTEL INDTLEEMEKWLEDFP I EP 

CPn_0207 247208 248617 

ybhI/sodiT1-Oxoglutarate/Malate Translocator 

VNKKKRFLSLLFLTAVLLG IWF5PHPAS INSNAWQLFAI FTTTIMGI IFQPVPMGAIAI I 
G I STLLLTOTLTLEQGLSGFHNP IAWLVFLSFS I AKG 1 1 KTGLGER IAYFFVSALGKSPL 
GLSYGLVITDFFLAPAIF^TAPAGGILYPWTSLSDSFGSSAEKGTQDLIGSFLIKVAY 
QSSV TTS AMFLTAMAGNFLVAALAGH VGVSLSWVLWAKAA 1 1 PGLPSLFLMP 1 1 LYKLY P 
PK ITGCEEA I RSAKLRLKEMGPLKKEEKT I LMI FFLLWLWTFGDLLG I SATTAALIGLS 
LL I LTN I LDWQKDVl ANTTAWETF IWFGALIMMASFLNQLG F I PLVGDS AAALVSGLSWK 
IG FPLLFLI YFYSHYLFASNTAH IGAM*^ P I FLAVS I S LGTNP I FAALTLAFASNLFGGLT 
HYG3GPAPLYFGSHLVTY0EWWRSGFAL3 1 VN I VIWIG IGSLWWKALCLI 

CPn_0208 250602 
pfkA-Fructose-6-P Phosphotransferase 

SVAV I LMH PLYVDLDT I It SYS PPLPKEFQEAAS L I AVPDTS HSKP WPGVKTLFPQTYH 
LPYLKFVC^ ^ FJ^l^TPLK^wVT^FS(r,PAP<X:H^JVIOGLFNSLKDF^^PDSSLVCFVN^^ 
[^flTIKG IDITEEFLL>KFHN';^*iGFNC KITCRKKIVTPEAKEACLKTAEALDLDGLVI IGGD 
( ;:JriTATA ILAEYFAKRRrKT:" T VGVPKT ID< jDLOHTFLDLTFGFDTATKFYSS I ISNISR 
DALrJCKAMYItFIKIJyKlK>A;:iltALE(:AL(JTll[*MtALI''jEEIAEKNLPLKTI IHKICSVIA 
UR/^AMEKYYi',VTLI VF.C I I EF I PEI [HI. [TE E F.GL:;EYEDK [:;RL3PE::QRLLKSFPAPI 
I FJj fl ,NDRDAHGNVYV:'K : ;:VWLL I f II .V: :NI 1 WYFPrrVPFNA [ ;'HFUJYEGRSGLPTK 

fwity' ;y::lgycai ; t lvknu- ;: i' ;vr.; ;t i i:: :i j\cpfmkv/kl.ra [ cwKMPrvKOOADGTLQ 

I"'K IKKYLVD t(i:JTAKRKKKtiYI'K I WAi ,l\D; !YP KU II'F/J t F , 7I'!'I'KMH::t)NFPPLTLLLNHN 

F-woPii^iciEirrrrrY 



244966 245802 
IILlJMTVK [LK7L3FPRSLLRTT."LWYRP 
CPn_0210 2524 r j? 251410 

No robust homolog present in Genebank/EMBL as of 11/7/98 
YOKLWERERCTFKT [REKEHAT [3TMLVELEALKREFAHLKO0KPTSD0EITSLYQCLDH 
LEFVLLGLCODK FLKATEDEDVLFESQKA I DAWIALLTKARDVLGLGDIGA I YQT I EFLG 
A Y L5KVNRR AFC I AJ E I H FLKT A I RDLNAYY LLDFRWPLCK I EEFVDWSNDCVE I AKRKL 
TFEKETKELNESLLREEHAMEKCS tODLORKU'DI I IELHDVSLFCF3KTPSQEEYQKD 

■\r . ;'.f .7m!'Kv;:.v-.ai:i i: :i .r ■: :m[-"-:e:-tv" ; r • xptk 
CPn_0211 252765 252463 

No robust homolog present in Genebank/EMBL as of 11/7/98 
ECVM5YPD I SNVQASS IQSALLHKTS DQIQQKRCFKQSTFVI LAVSLVI ICSLFLLAGVA 
I LTVFSHGVLSLVFCVLG IVLGLLLLAOGVGLLVEEAKSLL 

CPn_0212 254081 252888 

No robust homolog present in Genebank/EMBL as of 11/7/98 
ELSYGVWS IYSEILSFS ELTSCKHSLFP FG P I ETAS I RIHHVFNWIVCLI I LGTLFVC , 
LGMVFLGVFSTYWjGMSSMILGLLLISIGIJUjLKF^ 

1 1 OMQDQ I ADLARELDL EQKKDTL I RGFSARLDVLEGSKTEKKQ I LKIGVPRNLSEIQER 
AQEQNS I LEQCKEALLFRRKS AQE I FKKL YDRKAAFWRS YREDLWCYS E I HVS KKALSNL 
YIGDVFEXin'APHFLMEAYAMCRTAKNLRNYVKVCVEDMRVNEEKKR^ 
EIETTDLENETNLFTSDSEDVLEEYQIHCIRVTMLHALWAIYNDEWSRKPIDTL^RVPJ^R 
MAVEDC IETFEELQMC WHTKTLELEI AQLYVDI LLEA 

CPn_0213 254345 254190 

No robust homolog present in Genebank/EMBL as of 11/7/98 
r LWFSRVI FSNTNQ IG I PRLEL I LPLWKKENDPFCFLFSRVEGTF 1 1 LN IK 

CPn_0214 255768 254446 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FLGLKEDYERPTYCNIPPAPHPQRVDSKGCIASHVSTVVWAIJILGIFFI^GSLAFLVH 

TSCGVLLGAALPIUTIGLVIIAVALIVFI^HKHKTRODLDYYDODLDSLVIHKKEIPNDI 

SELRVTFEKI^NLFQFHTKDFSDLSQELG^KFI^O^EXWLTLEDEVTKFLIVRDRP , LETR 

RNFTTFGEQVKG IQSNI FDLHEEKSSLYLELYRLRKDLQVU^FLLPPGILKVDYDEI E 

AI KGLF I RLTSRLDKLDVKAQERKKF INEMSREFKEVEKAFDIVDRATKKUmRAKKESP 

ARLFMGRTESI^EMKKNEEAIJCNG£IXPENI£HPELFSPYG£IX^ 

LI SGTVTSGLTLEECENRMRAASTGLNALLVRKLQFRGAIKSAYFEKLTEIEKELRSLQD 

VIKSLELELIHKIKDIVTEET 

CPn_0215 257039 255759 

No robust homolog present in Genebank/EMBL as of 11/7/98 

LTpKKQVMSSAIAJ^FPSPSPQPSSTLGVHPPKYKSLIL 

V^£FSFSVLTVGU3GAGWLGSLLLILGLIFF^ 

U$*iLNEVQEWSNFLLDEWEDFKEWAQHKSQF 

DVA^LTELKNIWGPLEFLJIKKGDRU^EIDKLPJ<E^^ 

KIS^CYl^KJ^KVEKL 

TVFStEELQEALDKAKAEI^IQVRKSWEDLSCEPTLIQW 

TESSEQ EKVLEEYEALKARI RiTCLRVKI^VRANVAFVASTTDLLS ESESLDGNDSVFED 
AHDDFLD 

CPn_0216 257623 257174 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NKARTMNPVT FDR I QVDF I PEDTSLR INSY IVAGGLL ILGWLS I LSVICLD IGLVGLS A 
GAAFTLGLGCL I FALFLFS FS L I LLLSQEKRVPDVLSLYLEKEVPQ YET PLYKEDLESER / 
DK§ AI S ERLG 1 1 EEKLR I AEKFRYSDSVFV 

r~ 

CPn_0217 257881 258579 

yp&P* 

PKCCKLKG FLSVNEL IFGFQTF SVWLGVF F AS RGKAWLTGWL S LLSSI MNV FVLKOf H L 
WFEVTSADVYVIGLLTCLNYAREHYEKNDINDAMLCSWVIS IAFLVLTQLHLFLI /SPN 
DSSQEHFLALFSSTPR IWASLVTLI FVQI VDI KLFTFLQRVFSKKYFAMRSTI SLLFSQ 
LI £qi I FSFLGLYGLVSNLCDVMI FAMLVKG IVITLAI PTLTVTKAVLDRRSS 

a LI 

CPn_0218 259064 258582 

No robust homolog present in Genebank/EMBL as of 11/7/98 

ifl'Skkvffesyedfanvasswpkslralvcoryfvdselk^ 
rslpi istiggi irlieahsgpihprdkmkyrfevlqavieilglgvlilvtdi igcfla 

FLVAI I LSLLLYCNSTFTCVQNLSFTERMLEGIGEAVNFLA 

CPn_0219 259348 260472 

tgt-Queuine tRNA Ribosyl Transferase 

gsslalkfhlihqskksqarvgqietshgvidtpap/pvathgalkg/idhsdipllfcn 
tyhlllh pg peavaklgglhqfmgrqap I ITDSGGFQ I FSLAYGSVACE I KSCGKKKGMS 

slvkitdegawfksyrdgrklflspelsvqaqkdlgadi i i pldel/pfhtdoeyfltsc 
srtyvwekrsleyhrkdprhqsmygvihggldpeqrrigvrfvedZpfdgsaiogslgrn 
lq ems ewk itts flsk er pvh llg igdlps i yamvg fg i ds fd^s y ptkaarhgl i ls k 

AG P I K I GQQKYSQDSS T I D PSC SCLTC LSC I S RAYLRHLFKVREiPNAA I WAS I HNLHHMQ 
QVMKEIREAILKDEI 

CPn_0220 260660 261236 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FYSFLKKKC I FYMSKES I RSYSE ISTPTP I FRETPSKECV/yKLQLRSPAKDC I LRNRVS 
LKG ALL R SI P F YG5 F LGAK R I H S AWS AKDA PCTTRVYHY JwGG L ELLGLGVWLAC KVLA 
TALKFLFSKASSKIKQMKWREKARNLAAKDTVCSIKEFOSVDLTSCFTRCFRLRNRWEE 
GASENQTVREIIV 

CPn_0221 261621 26201-1 

No robust homolog present in Genebank/EMBL as of 11/7/98 
DAI.RYKYF.IGrQMVNRYKS^AEFSADHYYDDNLVp/^YKRNLRGL^PVENEVCLFEENNL 

[..E:;vMA:;[p[Mr.:;iLci/;Rui5vw.7rQDPKDSKi/i [fhtalciletu'lci ivllikit 

IT I LL [ LFTPCLLCYFMY^AAYSDFHP I 

CPn_0222 ,it,2474 ili.L'Xl/ 

'wo.ik Lint Lit Lty to Bacr.*r iophufc "HPl (Oit4) 

i;RKF[,KrWEKLRKU^AFEI/rfJPEEYRNRW^rr;y^PT(:RTQHAKVW::YR^VHEA.GLYE 
KtJ(:FLTfV[TnnKIU J njYC:;LVKr 1 |(LOLf-X^LRKMi:;PHKlRYFF.i;^AYCTKLORPHYHL 



No robust homolog present in Genebank/EMBL as of 11/7/98 
: 7TML ICRYSSDDQFTEATK^^R K LJFVRDNLEGLTNP I S E I VSET3SS I KDSVLRSL 
P I LG5 I LOC AR L Y.'TTL.'TTN D PlTeTQ Eljf I WT (T I FCALETLCLG I L I LLFKI I FVILHC I F 
HLVICFCK 
CPn_0224 263402 263674 

No robust homolog present in Genebank/EMBL as of 11/7/98 
YTFKNPKWiKKMKPNS I IFLENTI^YFDr FRECFVRDRHGLMEASDWLLSTEITIIRSIL 

CPn_0225 lini/jc Jo4^4i 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NSFTIKFLLMTKNAINSenTPQPNLTDAEPIASRAOCKS IAVI ISLFALGMLLLCLGI I 
LI SI PI PGLAAQVALGLCIVSLILG I ALANIGFLCLLLRCKQVPOKPDTLPSESSKOPSE 
GSTPTALPWOAGEFLEKVCvSATPILLPKNKDEELSAKVMKEGAEAASSIKQAVLESTEK 
LI DARKOEESRREARKK I#AEEAEAS RKR IQOQMAADOEALRKRKEEVAKRK 

CPn_0226 264545 264967 

No robust homolog present in Genebank/EMBL as of 11/7/98 
AI FNRKRMPYYANTLEF IQGTQSLCPLFKYGFVRHHYKGOLE IEDASHDWDFLEPPSTWK 
RTLLAA I P I LGSVISLGRLFS IWS I REPQDSQEYKS I FWHTLCAVLEI LGLG I VAL ILKI 
LATF IMAMPGLKRVATFLFYS 

CPn_0227 265467 265009 
dsbB-Disulfide bond Oxidoreductase 

KERFNI FVSCHLLKEIMMINF IRSYALYFAWAI SCAGTL I S I FYSY I LNVEPC I LCYYQR 
I C L F PLTV I LG I S AYREDS S I KLY I L PQ AVLG LGISIYQVFLQEI PGMQ LDICGRVSCST 
KI FLFSYVM PMASWAFGAIVCLLVLTKKYRG 

CPn_0228 266242 265412 

dsbG-Disulfide Bond Chaperone 

vkdradf lnlkekfsc s i lkkenafe fyvfc s i kqltns slrg plnkk i lvlctamf f i v 
cfgflihkkht i lppkah i ptnaxhf pt ignpyap initvfeepsc s acaefttewpll 
kkhym3tge isftlipvcfi rgs kpaaqallc i yhhdprqad i daymeyfhr i ltypkee 
gshwatpevltklaeglkinsgrsvnpkgleoc i asgqyneq i kknnl ygsqvlggolat 
pta/vgdyliedptfheieraiqh irqlqavegdhdd 

CPn_0229 266163 267560 

CT178 hypothetical protein 

|rf.qKAF.qFr.RTF.QF>JF^FKFKKqAr..qFTYTrrANTTK.qTFTFTrjJlJ.R^rXX^^RFMnKFT 
NIYRHFRYKFLKI^ILPAPI^I JJ J^SPNTLNYTQVDVIFSDRI^TSCLLIFLAIASLT 
f KRSLLWLGAPLGIWVTLFACVAGRSPTIFANDTLIGFAILAVVCI sptrpealevgptlp 
EGFSYNPSAGGRRAAVLFLSIXGWLEAJIYLTASSLGITSSQSSNFLI^YSSIMTVYSLLV 
VLSLAGSEKRWHTRPKIVIATALALTGVI ILTLLPI I LHQLRYDCWLCLCLT I EPALAW 

FAYDETRATLRYI sqflgdkraltrasffgseyykhtlsweertvlplrkaykoafeg I s 

FPINQU^ILVATVFVKVNSSMGLPTFPRNFI^ICCW^ 
L FSAAILFS PVLFH I PVESPMFLP I IVTGLIL 1 1 LS IGKRRRTKRKL 

CPn_0230 268277 267576 

CT179 hypothetical protein 
RFKKAL IYMSSQPLVTTSSSLSRYVVLTGEEKVACYKKAFNHIWHGAPAI ILAAALLMFC 
IFGFVLGSILLGAPLEGASILYDVILPWLLPSILVFVIXVLPLNIYAYSHHKQVLALHER 
ITQSNYKE I YDH C EKEKKT PNKKALS LY I ESQVLVP EY S KRF S SM I LGKTLK 1 1 PKKDS P 
ES LKHDEL I QKALERAKEN I YMhn^QREKRDEREAKKEAKNAS KTN PLWEGLGT 

CPn_0231 268996 268253 

tauB-ABC Transport ATPase (Nitrate/Fe) 

PQAFVSIQDRGFSMLQAHPOrYSCDNQVILKDASFQASPGTITIILGSSGVGKTTLFRLL 

AGFLPLQEGELLWNGSPLrniKDVAYMTOKEALLPWRTALKWOTLSTELGIN^ 

RLEE 1 1 HNFDLGQLLDRYPDELSGGQRQR IALAAQCLSLKP I LLLDEPFSSLDVLLKEQL 

YQDIVAI^KENKTV1,LVTHDFHDVSCLGDVLWIKNKTLTPVPLDPSMRPLN^LCFIK 

DLKKHLYT 

CPn_0232 270134 269232 

•similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine 
Nucleosidase 

KKFLMRRFLFLILSSLPLVAFSADNFTILEEKQSPLSRVSI IFALPGVTPVSFDGNCPI P 
WFSHSKKTLEGQRIYYSGDSFGKYFWSALWPNTCVSSAWACNMILKHRVDLILIIGSCY 
SRSQDS RFGSVLVSKG Y INYDADVRPFFERFE I PDI KKSVFATSEVHREAILRGGEEF I S 
THKQEIEELLKTHGYLKSTTKTEHTLMEGLVATGESFAMSRNYFLSLOKLYPEIHGFDSV 
SGAVSQVCYEYSIPCLGVNILLPHPLESRSNEDWKHLQSEASKIYMDTLLKSVLKELCSS 
H 

CPn_0233 270439 270248 

No robust homolog present in Genebank/EMBL as of 11/7/98 
EKARTMFLGKVLLFLLRISRRSYVQE IGI FFHLETPDLKIVLCAFVSTFIWEMDVSLKN 
KGQS 

CPn_0234 271246 270548 

CT181 hypothetical protein 

FIML03CKKALLSIWSILAFHPIPGMGVEAKSGFLGKVKGWFSKKEIQEEARILPVKDS 
LSWKRY DYTSSSGFSVEFPGEPDHSGQ I VEVPQS E IT IRYDTYVTETH PDNTVYWSVWE 
YPEKVD I SRPELNLQEGFSGMM0ALPE3QVLFMQARQ IQGHKALEFWI VCEDVYFRGML I 
SVNHTLYQVFMVYKNKNPQALDKEYEAFSOSFK ITKIREPRTI PSSVKKKVSL 

CPn_0235 271395 272177 

kdsB-deoxyoctulonosic Acid Synthetase 

VFVRYLLMKPEE3ECLC rGVLPARWNS."RYPCKPLAKIHGK3LIQRTYENASQSSLLDKI 
WATDDQHI IDHVTDFGGYAVMT.^PTC.-rJCTERTGEVARKYFPKAEI IVNIOGDEPCLNS 
EWDALVQKLRSS PEAELVTPVALTTDREE I LTEKKVKCyFD.GECRALYFSRSP I PF ILK 
KATPVrLHIGVYAFKREALFRYLQHr::;TPLSDAEDLEOLRFLEHGGKIHVCtVDAKSPSV 
DYPEDIAKVI^CY ITCL:;NAYF 

F^yt'I -'.TP :'Jyrir.huL.iuo 

:JHTrYHMPFKvMFL'rGf;Wr:r;LGKGLTAA.';i,ALILERORLfA/AMLKLD[ > YLNVDPf:TMNP 
K F.I (( 1 E I YVT DD( i V ETDLDLCH Y H R F.' i:'J-ALi ' U I in." ATW>J t Y ARV T K R E R FO DY LGHTVO 
V Nit TTNE [ COV ILDAAKEIir;POVL 1 7K [f yrr fGDIEflLPFLEA I ROFRYDHCEDCLNIH 
MTYVPYLO/VADEVK. r JKPr0H:;V0TI.I"; \ ( \ f I PliAILCR.'jEKPLTOEVKriKIIjLFCNVPNR 
AVKWtDVKHTEYEMPrWLAOEKlAflFf^FKI.KI.ATVPEflLDDWKVLVNOI^OLlLPKVKl 

' :w( ikyvohrdayk:: [ FEALTHAALP.r/ jiiaak r r p i da f. l Er j ltm e u"^ r dac \ l v Pf jc i fg 
VRGWECK IAAAKFCREQC I PYFG rCLGMCVL WEY Al 
VMEXXJOPLVATGCrrMRLCAYPCLLKPCSKAHKAYNES 
IICLR rVCTCPPOGLCE [ EEVCDHPWH trWQFHPEFVSKl 
:iHV 

273741 274214 



iss^^ki 

skiSWTpi 



AN3LEMDPNTPHPIVY 
IRHRYEVNPDY IQSLED 
PLFIAFI EAALVYSKDA 



LEVMKFEFSVALKYLIPC;R<^^BrVGLF.//Gr :3L'^VWLJ t VFrSVrHCLEGRWIEDL 
SQUiSPrTILPSDTYY3or^MfHS3L^rrrTKTLCEKIA5POVDPYDPESDYl.LPET 
FPLKDCDLGCC^KDPvXMTLEsEuPYLQSbHGKVIEFEOG^ 
THFLTYPSKLSYEDKVLPYDETDYTSAEyNPFNRSPSCWOQDFHHLEELYRGAS 1 1 LPST 
YKDSGYKVCDTX7VTSTYSIENEKETQY*EW/IGFTN 

SECLGMSNCFHLFFPNTKR IVFVKKQ l£N t LTSLGVDDYWE I SSLHDYDYFQPILDQLQS 

DQVLFLFVC r lilivacsnivtmsml/vnnkkke ici lkamgtssrslki IFACCGAFSG 

ACCWtCTIFAIITLK^FIVKAl^[^RETFNTAFFCONLPNSVHPQArYFLGLGTL 

r ,aav"~a: .papkva fw r/" r : r ; -" y 

CPn_0250 2b»jUo/ ^o'^.'iJ 

rl33-L33 Ribosomal Protein 

KDSSMASKNREI IKLKSSESSDWYWTVKNKRKTTGRLELKKYDRKUlRHVire 

CPn_0251 286036 287559 

•conserved hypothetical protein 

SPDSCLPWMSPFKKIV>mU^rSFOKESRTLPIIIREPFWfrTKSLGSFNSVISKNKIHF 
I S LGCS RNL VDS EVMLG I DLKAGYESTNE I ED ADYL I LNTCAFLKS ARD EAKDYLDHL I D 
VKKENAK 1 1 VTGCHTSNHRDELKPWMSH I HYLLCSGDVEN I LSAI ESRESGEK I SAKSY I 
EMGEVPRQLSTPKHYAYtKVAEGCRKRCAFCI I PS IKGKLRSKPLDQILKEFRILVNKSV 
KE I ILIAQDlJGDYGKDI^DRSSOLESLLHELLKEPGDYWLRMLYLYPDEVSDGI I DLMQ 
SNPKIXPYVDIPLQH^RILKWRTTSREQILGFLEKLRAKV'PQVYIRSSVIVGFPGE 
TQEEFQElADFIGECWIDNLGIFLYSQEAirrPAAELPDQIPEKVXESRIJCILSOIOKRNV 
DKHNQKL IG EK I EAy I DNYH PETNLLLTARFYGQAP EVDPC 1 1 VNEAKLVS HFGERC FIE 
ITGTAGYDLVGRVykKSQNQALLKTSKA 

CPn_0252 288112 287576 

CT144 hypothetical protein (frame-shift with 0253?) 

ATSTVCALWILQTYQSHDDAASCSFRRACRFGRYWLGGVWPWNKFNCT 

Y I DSSCTWMMRFQASAS I PRLFRIS I FMTKHGDWI DNGTGGELLLVAYEANQNPLF PD I R 

IEIJ^TCSOTSYYFJU*PM(>/LCSTYYAV 

CPn_0253 288474 287950 

CT144 hypothetical protein (frame-shift with 0253?) 

FCGGRLMSSS I PTTQKITI S I PTFVRFN I ES INLTDEQKKTALT IGWIATENTQVLGNF 

VDADGGilCQNDLSVGGNINITPQTFNTMVFSGRV^NSPFSY^ 

EQ PCCYVPYGYYKLTRVMMMQRAALSGGHVGSG DIGK3ESMYLG I S S I KRQHKVQ 

CPn_0254 289268 288459 

CT143 hypothetical protein 

I PhnCTLGVKDQNLFIDQATLSVERNvTUEN^ PCEF I VXGNVSAEGS 

QI/NATTLSDGFN I YSKTDVSQTPVCNNI SDPQSARDALT FSYYRKTGCQAANLYTYY PGN 
YVAPNTT I ETHVAA ITSKSVSRNATPDFSR YAD I EPWKLKQVG I YQVTMQ LTRWSGQ 
GDNSATLIIJ^FVSGNNXTIXCTSDTRGGYSSDRTSVAVTAIFSVTELVSSPPYDYPWI 
LE^IWW4I>lSl^TCVrWFPFPSNFVEVD 

CPn_0255 290183 289329 

CT142 hypothetical protein 
TLLWIMKNNINNNECYFKU3STVDC^ 

VSATCLTSGTTYNUIAQNFTSSOISI DFKNNRLSNC ALP KEDCDPVPANYVRS PEYFFCS 
PLIGDFDFNSGESYLPLTGSEYTLYQSRNVNSIFTIFIGVWOSTRELTVGGNTAIQFLAA 
TY IVSFTVGKRWGWNNGWGGA IY INNGLGQVQC EST I YSGGGYAT IGTLGTS I YKASVD 
r VAPNPNDPNASDRYRAGIFYLSNGGSSAGIGNYSFSLLYYPDDRG 

CPn_0256 291282 290398 

CT144 hypothetical protein 
FCGGRIJISNPTPKTKISIPTFVRFNIQSI^TEDQ^ 

DGGLTCQSDLTIQKDINI RPTSTNSMVFDGRLNLSNS PLSYKNSQGQDI TDYEKMSSGKP 
QEYVPFGYY1WTQIMMA0RAAHSSGYVGGGSVPSGSYVPWNKFDQTSTQKTSGTEIYIDP 
OTSTKLVTEVNNKVPKLFRISVIMAKHGSwlIJN^ 

TTSRGSSYYETRPLQVVCVTYYAQNNGYFTFONRAGGGLRVSFFSWNIVALPYVE 

CPn_0257 292136 291267 

CT143 hypothetical protein 

GVVMKRRM^KILPNASTPSTNVAE^IKDQNLFLDQATLNVDGNV^ 
ADTITSPCEFTVGGGLSAESSQFKATTLSKGLEITSEDQDGRVPKFTNVSDPQSPRDALT 
YNYYRM'GCQALNLYTYYSSSQPTTVGKP I ETVCONPNPETYR I SASAK I YDAVTR FPY I 
QF KAPG I YQ VT IQ I RRESGQHSGLDN PNL YLNLM IGNNKTLLC AS DTRG YSGGKRT S I A V 
TGTFTLTEIVATPPHDYPWLFLETTIGLDIKSMSTCVIWFPFOANFAEVD 

CPn_0258 292534 292133 

CT142 hypothetical protein (frame-shift with 0259?) 

CFSFCRLGSKFEK I TLGGNT A I Q LLAAGT Y I LTFT IGKRWGWNNGWGGS IRLFEGKYTGD 

GTMLCGSTVYSGGGYSTIGYLSTAVYRDHSDrDPDPNNPSDKYMNNFLFVRNGDHSAVIG 

NY S FTLL Y F AG DK V 

CPn_0259 293031 292441 

CT142 hypothetical protein (frame-shift with 0259?) 

E YFVFKRKTYNYF I EMTTTNNQDNNECYFKLDSTVDGDLLASNIOTFDKQAKG ISSTETF 

SVQCNATFKEKVSATGLTSASTYKLNATGPAPSS IT I DMKNNRLSNPALPKNPCDPVPAN 

YVRS PQY FFCAKP I EGTFMFDGSSRYLP ITGDGSNYTLYQSSKACDVFRFVDWDQNSKKL 

HLCGTQPYNFLLQEPIS 

CPn_0260 294090 293548 

secA-Protein Translocase Subunit 

AY LDFSKRSCVEEDHVSKK I NRNDLC PCGSNKK YKQCC LKKE EQTARYTTECKFKFSAEV 
LS AS EQG EAG DNCTKLFQR LSQS LT S EQK AAVG KFHQ I T KNK EVMS KKALKKAQAK EEKL 

VTEKLQOHNFEIUTTGENLAPPMESTATLNCDTNFVCEDFrPTOEDFRISENSOKPPVEE 
D 

CPn_0261 2'.»427;! 295013 

ydaO-PP-Loop Superfamily ATPase 

YnFMHPFrVFM:;TLLLNPPWMKA'';KR [EriLVRKALYTHTMLJVNUUK t WAI.JfXKDSLTL 
LLMLKAICaRGFr'OLDLlIAVNr^KY.TOAI^VNKPYLTH [CDOt-C [ PFRTTPSPYAPETP 
ECYrx::;fJARRRLLFOAv\KE[nA:;A[AFY;ilHRDDLVOTA[.UILLIIKAKFA(^MLPVI.DMVMF 
^"VTILnPL[FTPEFWrRKFAKl-:rx:FARVr('RClW:[.R:;KAKO:"I.KI.[.KEVFPLAKI!NtA 

LAtfjEnc:;r;K^OKi 

I.IFNINKEVKVYLVLWNKRLK I [LTNW/I fTAW W.CLV.^ALLFAN U ID I Y [ AAPQAEO:: 



CPn_0237 
yqgF Family 

7. rLRMO-AMSKPSSCKAYLGrDYGKKR rOLAYAAEPLLLTLPIGNIEAGKNLKLSAEALHK 

; :;..:;.■!! :■: /; , irn-UMvi' :;. v. :;.;,ak::: xk:.. "r/i-:r : :.wr,FH..7:"V0AERH 
;.km<- ;-Kf^Tr::i,AAri,f[;r.-:-:.;-::LFrP';.:". 

CPn_0238 274210 275838 

zwf-Glucose-6-P Dehydrogenase 

PCNHQKLRDFNFRNFLLFVIFASAGTKKEIKWnJWQETIGGLNSPRTCPPCILVIFGAT 

GDLTARKLLPALYHLTKEGRLSDOFVC^FARREKSNELFRQEMKQAVIQFSPSELDIKV 

WEDFC^PXFYHRSEFD^CrrrSLKDSLEDLDKTYGTRGNRLPfLSTPPQYFSRIIENLN 

KHKLFYKNQDOGKPWSRVIIEKPFGRDLOSAXQLC^CINENLNENSVYHIDHYICKETVQ 

N I LTTR F ANT I F ESCWN SQ Y I DHVQ ISLSETIGIGS RGN FF E KSGMLRDMVQNHMMQLLC 

LLTMEPPTT FDADE I RK EK I K I LQR I S P FSEGSS IVRGQYGPGTVQGVSVLGYREEENVD 

KDSRVETYVALKTVINNPRWLGVPFYLRAGKRLAKKSTDISIIFTQtSPYNLFAAErc 

PIENDLLI I R IOPDEGVALKFNCKVPCTNNIVRPVKMDFRYDSYFCTTTPEAYERIXCDC 

IIGDRTLFTGGDEVMASWKLFTPVLEEWDQDSSPSFPNYPAGSSGPKEADALIERDGRSW 

RPL 

CPn_0239 275863 276672 

devB-Glucose-6-P Dehydrogenase (DevB family) 

KS I SHTNIG I ETMATL INFNDTNKLLLTKQPS LF I DLASKDWI ASANQA IKQRGAFYVAL 
SGGKTPLE I YKD IVINKDKL IDPSK I FLFWGDERLAP ITSSESNYGQAMS I LRDLNI PDE 
Q I FRMETENPDGAKKYQEL IENKI PDASFDM IMLGLGEDGHTLSLFSNTSALEEENDLW 
FNSVPHLETERMTLTFPCVHKG KHVWYVQG ENKKP I LKSVF FS EGREEKLYPI ERVGRD 
RS PLFWI I S PESYDI ADFDN I SS IYKMD I L 

CPn_0240 277861 276698 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LVYFMVFSPSSESVVKANSWRS^CYFLENKFVSPSESTEVMFSEIMKGRVPDIESLFD 
RPTDMMMTGFKAAQNLGNLFTJSFGILIMCFSCCKSC^ 
LGPTLGALVYCAYKVYTLGKMIYSLhnCAKAKVLRHPAQNW 

KL YKS AM IG SLWS L I AS LAL I ALT AG IVLVLFFVAPGAAPVITAAMMGCCAAGGGALL I 

SLLGLWIAIVRKAKHQEACVGHLTNWLHTAVSEALIJiDPSHFC^^ 

YGHLFS NEEVAQ L VQGG APGGG SR P SQHYGG S S DYQNRRGGNGNFGG S H FGGGGG FAGS H 

FG&GYPTAPTMPSAPPPFPPPAYOTIYG 

CPn_0241 279372 278203 

No robust homolog present in Genebank/EMBL as of 11/7/98 
I F^WFttSAM I SLSSSHEAS IASNTQVRI^VSLA^ 

TAJ iPDLKDWETEG EHHFQVYSNI S LKMI YQRFFEK I FG IGCC PLLLVTDSHHTDPCGA 
L ITGT FAAVLFTVLAI VFG PTLGI LC YSAYKI YQLTKK ISSLSRTHTEVINSVQKSDPF I 
HR^VAAAAASQST I KACKVFRQSTLI FFVLGLI IT ISLAALIVGLVFALFFLDPGAPA 
VMTAAM IGCCAAGGTG IU.SVIGFLLASVYSVQKSQEGVHHMHTALLRC IVSNTI IQMPY 
LP | EPGTKKVLTQS I RRYQQ F FS DDEYRD I ES EVPLNRQTT P PPSY ETLFHEEGSDGSSN 
Vl|BLESPPAYSTIDSSNSPFPSSSPPPYYR 

CPn_0242 279975 279487 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KSLKYCSLYQFSQKPTVILMACSIFFRMSCXjDYDDEPI^KKTACLVVDTMLYPVIAVVCA 
WSVVLLILKVLFLLLSFPFKLCSASSALPGERVSLGSHFKCLYGGGLPYLLACLLIVPV 
IG|M HGF 1 1 SHRTS EDARLSSA I VFMQAP I LQLAGMSGL IKP 

CPn_0243 280609 280133 

No robust homolog present in Genebank/EMBL as of 11/7/98 

in^^lvfllkfvkgriimacsigyhixnanepdrfvaskvalvadillypfmavicaw/ 
favlmvvkllflaikflvntciaacksrplpsckf^jfcclfgpkdkpgpsdwix^lvli 
i iphi ysti itvqsdtnrlryfi ispayqvgstai inw 

CPn_0244 280906 281556 

adk-Adenylate Kinase 

GAfflSjPTKGSVF I IMGPPGSGKGTQSQYLANRIGLPH I STGDLLRAI IREGTPNGLKAKAY 
LDKGAFVPSDFVWEI LKEKLQSQACSKGC I IDGFPRTLDQAHLLDSFLMDVHSNYTVIFL 
EISEDEILKRVCSRFLCPSCSRIYNTSCXSHTECPDCHVPLIRRSDDTPEIIKERW-KYQE 
RTAPVIjAYYDSLGKLCRVSSENKEDLVFEDILKCIYK 

CPn_0245 281627 282499 

ydhO-Polysaccharide Hydrolase-Invasin Repeat Family 
TCQKEIMKHYLSFSPSADFFSKGGAIETOVLFGER'/LVKGSTCYAYSQLFHKELLWXPYP 
GHSFRSTLVPCTPEFH IHPNVSWSVDAFLDPWGI PLPFGTLLHVNSQNTyi FPKDILNH 
MNT IWCSGTPQCDPRHLRRLNYNFFAELL I KDADLLLNFPYVWGGRSVHHBLEKPGVDCS 
GF IN I LYOAOCYNVPRNAA DQY ADCHWI SS FENLPSGGL I FLY PKE EKM SH VMLKQDS S 
TLrHASGGGKKVEYFILEQDGKFLDSTYLFFRNNQRGRAFFGIPRKRK/FL 

CPn_0246 282955 282551 

rs9-S9 Ribosomal Protein 
WAKSTIOESVATGRRKOAVSSVRLRPGSGKIDVNGKSFEDYFPL^IORTTILSPLKKIT 
EDQSQYDLI I RVSGGG IQGQVIATRLGLARALLKEWEENRQDLKSCGFLTRDPRKKERKK 
YGHKKARKSFOFSKR 

CPn_0247 283430 282969 

rl13-L13 Ribosomal Protein 
D.*;Y I I M EK R K DT KTT I VK3 S ETTK SWYW D AAGKTLG R LS s£.Vt\K I LRC K H K VT YT PHV A 
MGIX IV T V r N AEK VR LTGAK KGQK I YRYYTGY [ SGMP.E I PFpJMMAR KPNY 1 1 EH AT KGMM 
l-HTKUIKKyLKSLRIVKGDSYETFESOKPCLLDr 

".TuJIlMH ::h445'J JSlh50 

y-'t V/yH>A Afif .' Tr.inuport«ir ATr.iK^ 

i<:;t'r<rn,i\:iw<YAT[\;r<:;FRr:pACKK:;RKN^^ 

mN t : : I f .T(jV:;r.: :i JIAf ;KT IT< JA^MNilKTTf.LHLL/TLDVP.S.SGnLRFFDKDLKNODLA 
NFKrMI I' M'VFONKYt .I.EDDTVLKNVtWFAL rARK/lJJK'^PVYTRALELLDLVNLEDKV 
HTRC: :K1 .; If ;r ;i :K«.mVA r ARAL r NEPA I LLADFPliOf ildeet:jeoihnllleoasalcc I L 
[ vtunk m i .a: : i" ; n w ;v i -.w ; k lf f i in: ' 
- ;K:;MA [::L.NOWCASPYAypOPVKEAWAVGG3PTDCV^^fctl.FE3V3PDLVrSGrNCG 
TiN [CKNAWY.XT tCAAKQALVDC I P3MALSQDNH ISF^^MPEILKALVI YLLSQPFP 

wmuiiUF rr:; pocs swegmr lvp pgdeffy eepoy loWRnqyyvc k i scvr iceh p 

HEELACMLENH I5V3P rF.^QNSP IGLMTLEEFQKTOENFNASLLSSELTTKIF 



FTEPOAL/XILELRLY^LTn^^BCKEY'FELLNKIAYYKOVLSDEGLVKDirRNEUJCL 
LKHHKVAflRTTIEFDADDr^^K ITNEJVI ITtSCDDYVKRMPVKVFKEQRRGGHGVT 
GFDMKKGAGFLKAV^AFTKCra I FTNFOQCYWLKVWQLPEGERRAKGKPI INFLECIR 
PGEELAAILNI KNFDNAGFLFLATKRGWKKVSLDAFSNPRKKG LRALE I DEGDEL IAAC 
H I VS D E EKVMLFTH LCMA7R F P H EKVR PMC RT ARGV RGV S LKNE ZDKWSCQ I VT ENQS V 
LIVCDOGFCKR5LVEDFRET^JRGCVGVRSIL^^rcR^^G^^/IJGA^pWDHDSILLMSSQGQA 
IRIWQDWVMCRSTOOVFLVHLKEEDALVSMEKLSS^ 



CPn_0263 296174 297136 

yqEtl hypothetical protein 

"TALSRRKLRVRPPGrjAKYAFP.GFRMSHGPRPTKFSFPLYFSKTLSWFILGGFLAACGVQ 

:-:v:.vi -f !■•■;. n :\*:i:,\/Air-ww,\i;r\->:Lw;^^^ 

; • \ L^i j^wi/ ;M:;E'FVrr^::!:Mirr/vr/ ^ ;a r i^^ri'ir.i r pik if ;;tdote I [/; 
[ r I NK K KC YTVGQ 1 1 L FVN F F I F AL3G I VY KNWHT AFVS F LT YG I ATKVMDMV I LGLEDT 
KSVT I ITS S PRKLGH I LMETLG IGLTY I HAEGGYSGEPRNLLYVVVERLQLSQLKEIVHR 
EDPSAFIAIENLHEVINGRRT 

CPn_0264 297730 297155 

ubiD-Phenylacrylate Decarboxylase 

MKRYWG I SGASGVILAVKLI KELVNAKHQVEVI I S PSGRKTLYYELGCQSFDALFSEEN 
LEY I HTHS I Q A I ES SLASG SC P VEAT 1 1 1 PC SMTTVAA I S IG LADNLLRRVADVALKERR 
PL I LVPRETPLHTIHLENLLKLSKSGATI FPPMPMWYFKPQSVEDLENALVGKILAYLNI 
PSDLTKQWSNPE 

CPn_0265 298632 297730 

ubiA-Benzoate Octaphenyltransferase 

KI I IVRLNYFLNLVNFKYSIFS rLFLSASTVFALSINEISQNLSFKEGFKISVFGAIAFV 
FARTTG IWNQC IDRF IDKKNTRTSKRVLPANLVSLNFAWVLSLFCSFLFULCKILRI F 
SLGIASLTL^IWPYllKRVTFFCHV^IXJLVYTVAILiMNFCAFA 

SVGMV I AANDI I YAI EDTEFDREEGLRSVPAHYGEKKAVEI AJCVNLWVSYLAYI FSGFVG 

SLDKEFYFTAIIPLWIZJCVVfcKTSNYSKKDQEGESKFFLJWI^ 

R 

CPn_0266 299181 299876 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IMAXJ3EINNQ^PSO^IASSTSQTSKINQDRKTFACTVT 

LGLSVPLSGILGTFA'vTVGAVLFITGLTILVRKSLGIEQKNEDLNFl^IKTFT'PPARPLM 
SKFSVTCSTTS I VLGMALL IGAWSVFFLTGYLOLGIXAGLVGLGTAIJ^AGLARMSPRS 
LADQEGSGSADSQSNIVGIGEPKAAQEQKWYKMAVVRGEDGIPTAIRLTPEK 

CPn_0267 300122 300910 

No robust homolog present in Genebank/EMBL as of 11/7/98 
V^ .IMS LNKTNALLNQP EP AVCLNAWDPKY I NQDRKT FACTVTLLVI ATLM I LTTGVI VLL 
AMCjPGLSVLVSTI IGTSVTTLGTALF I IGLVKLIKKSLAWIQYQKYFQEWKQKYEPFS 
IPKNDWHKLTSCLPSPLDIESPSPEASTPVSKIJIIACSGVAIVLGVTLLIGAWSVFFC 
TGYIsQLALCVGFACLGTALFVGGLAGLRTHSLIAQG IMYLYLTYYLSSALEERNETVKDQ 
RNE3NTYLTEECRQQKREKALLE 

CPn_0268 300914 301318 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KQ^SI^SQCQSSSTSTWEWMKSFVPNWKNPTPPLSPIPSEDEFILAYEPFVLPKTDPE 
. NA£ANPPGTSTPNVENGIDDI^PLLGQPNEQ 
QEBfiQGSONNEDL IG 

CPn_0269 302468 301476 

Dipeptidase 

VAHfcVMT I DMHCDLLSHPHFCRKDPAVRCSP EQLLSGGVRQQVCA I FVPHSRGEPNCDK 
QNSLFFSLPNQYPDIGLLSYEEEE^SSSQKKSLSLIRSIEWASALGDDTAPLGTLLAKL 
IHLTKQGPLAYIXJIVWKGDNRFGGGTEAPKRLSITCKVL^ 

ED|^YTATKLPMJ\VIASHSNFRSVLI>HRRN^ 

LGDLEKKVLHAENLG I LSS I VLGSDFFYANEDENFF FNECSS AEAHPVLNQLIHRI FSKG 
KAEglLSSRAEKFLKQVIVEQVNPKITDVKL 

CPn_0270 303343 302468 

ywlC-SuA5 Superfamily-related Protein 

S IFSVI VPDKKAQ ITFSLPEVMSAIHQGKIVALPTDTVYGFVLSLYASEAEERLYALKDR 
EPSKAFALYVNS I EDI EN I SGYPLSPTAKKLAOLFPGAITLWKHRNPRFPKETLAFRIV 

dh5yyreivdhcgtligtsanlsefpsal7aqeifadfadhdlcifdgpcshglestwa 
sdpfsyiyreglisrsvieniagteakifhrtshafskhikiytvknqeqlvsflsgsldf 
kgWehpkpknfytrlrealkkktpsivfiydintsdypelfpflspyyie 

CPn_0271 303628 304362 

Lysophospholipase esterase 

KLMTDYSFFRRKIGNI EAI ECPGNPQDPI I ILCHGYGSLADNLTFFPSICSFSKLRPTWI 
FPNGILPLE^roFRGSRACFPL^^/LIXQELSPXYA^lGVGNLQEKYDELFDVDLETPKEALE 
EL I LNLNRP YNE I I IGG FSQGA I LATHLVLTSQN PYAGAL I FAGARLFNQGWEEGLKQCA 
.QVPFLQSHGYEDEI LPYHLGAHLNDLLLTKLNGQFVSFHGGHEI PSWFQKMQVTVPNWI 
DPARG 



CPn_0272 305272 304340 



dnaX-DNA Pol III Gamma and Tau 
FNRQSDAT Y ATWVMH L EEENQGWEALLRKVYHQEVP PA I LLHGFTLPVLODKAEQLASE I i 
LLSSSPGSEHKVSQKIHPDIYQFFPEGKGRLHSIDLPRGIKKQIYISPFEANYKIYIIHE/ 
ADRMTLAA I SAFLKVF EEPPKHAVI I LTTAKVQRLPKTI ISRSLS I F I ERGEK ILCSKE 
F 5 Y LF R Y AQC E I PVTEVSQ 1 1 KESS ETDKQVLRDKVQRFMEVLLELYR D R YT LNLG LK ' 
ALNYPEHWEILOLPLLPLDKVLLrVESACRSLNNSSSAA^VLEWVAIOLVSLOYKEHEL 
VSVSPGQDLSN / 

CPn_0273 305853 305227 

tdk-Thymidylate Kinase 
' JG [ VF I V I EGGEG.XK5SLAKALGDQLVA0DRKVLLTREPGGCL IGERLRDL I l/EPPHLE 
LSRCCELFLFLG-jPAQHIOEVI IPALRDGYIVICERFHDGT IVYQG IAEGLGAOFVADLC 

iikwgptpflpnfvllldipadiglqrkhrokvfdkfekkplsyhnriregfZslasadp 
:'rylv[,dare:;la';lidkvmlhtolglct 

CPn_0274 JOyjfiR 305R52 

gyrA-DNA Gyrase Subunit A 
K::ti [<MFNKDE[ IVPKNLEEEMKEiJYLRYSM.WI [SPALPDIRKJLKp/QRRVLYAMKQL 
:;[.:;tfJAKHRKCAK [<;(J DT:.'GDY H PI IG ES V I Y PTLVPMAQNWAMR Y P Ly DGOGN FGS [ DG D 
I'l'/XAMRYTFAKLTIIwAMYLMEDLDKDTVDIVP^DETKHEPWFPJiBFPNLLCNGSSG [A 
Vf ;MATfl I PPHNLGKLI E^TLLLL^PUASVDEILOVMPGPDFPTGGyf ICGSEGIRSAYTT 
GRMK IKVPAHLUVEENEDKHREni riTEMPYNVNK.^PLI EOrANLWEKTLAGISDVROE 

: ; uk u ; r u wl e t k Kf ; fa: e 1 1 1 nr ly k r rDvovT fc amm laldkwl prtms [ hrm i s aw i 

HHHKEV f P P KTR Y FLNKAETRAH VLEGYLKALSCLDALVKT r PE/JCNKEHAK ER I IESFC 
FMDPKEKNYLAJA ir/LEGwAVRERHJMYIGUTG nXJL«!LVYEVVLii3 IDEAMAG Y Jo 
RIDVRI LEDGG IVIVDNCRG IPIEVH ERESAJCQGREVSALEVVLTVLHACGK FDKDSYKV 
SGGLHGVCVSCVNALS EKL VATVFKD KKC YQM E FS RG I PVT P LQYVSVS DRQGT E I VFY P 
DPKIFSTCTFORS ILMKRLRELAFLNRG ITrVFEDDRWSFDKVTFFYECGIQSFVSYLN 
QNKESLFSEPIYICGTRVGDDGEIEFEAALCW>JSGYSjELWSYAfWIPTRO^X3THLTGFS 
TALTRV I NTY I KAHNLAKNNKLALTGED I REG LTAWL SVKVPNPQFEGOTKQKLGNSDVS 
SVAQQWGEALT I FFEENPOIARMIVDKVFVAAQa/eAAKKARELTLRXSALDSARLPGK 
LIDCLEKDPEKCEMYr/EGDSAGGSAKO^RDRRFQttlLPlRGKILNVEKARLQKIFONOE 
IGTI IAALGCG IGADNFNLSKLRYRR 1 1 IMTDADvDGSH I RT LLX.T FFYRHMTAL I ENEC 
VYIAQPPLYKVSKKKDreYILSEKEMDSYLLMLcTNESSILFKSTERELRGEALESFINV 
ILDVESF INTLEKKAI PFS EFLEMYKEG IG YPZYYLAPATGMQGGRYLYSDEEKEEALAQ 
EETHKFK 1 1 ELY KVAVFVD IQNQLKEYGLD 1 3SYL I PQKNE I VIGNEDS PSCNYSCYTL E 
EV INYLKNLGRKG I E I QRYKGLGEMNADQL^DTTMNPEQRT LIHVS LKDAVEADH I FTML 
MGEEVP PRREF I ESHALS I R INNLD I 

CPn_0276 311140 310793 

CT191 hypothetical protein 
DMFLKRKKRGGSQVQNKRTASP IKHAIO^YLHNYLQELQKIMAARPHDAI DAWNQVFRDKY 
KGMSQAIGFRDH ILLVKVYNSSLYALpCQTPQNDL IMSLYQVASHVQIRE IQFLLG 

CPn_0277 312003 311404 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NISIFYPKYFIEGKEVLIKNLPPEIFYGVILMIINVPAPAfXJITSVQOFSTNFQAAIPIL 
NI VIGCSRI SSTYAEDI EEVAOStCLEKSTHSKSSTSVNLWAHRVRGWE ILGGG IV I LAL 
EITALVLOVIIKLIKCLIDVLgWCLFGLGVCWAIIGAIAFCVVVVV^ 
PIEVKTLISPDKPYPTWYV j 

CPn_0278 312884 312060 

•conserved outer membrane lipoprotein 

RDSMKKKLS LLVGL I FVLfiSCHKEDAQNK IRIVASPT PHAEt L.ESLQEEAKDLG IKLKIL 
PVDDYR I PNRIJjI^KQVTaANYFQHQ AFL.DDEC ERYDC KG EL W I AKVHLEPQ A I YS KKHS 
SLERLKSQKKLTIAI pyDRTNAQRALHLLEECGL IVCKG PANLNMT AKDVCGKENRS INI 
LEVSAPLLVGSLPDVDMVIPGNFAIAANLSPKXDSUTLEDLSVSKYTNLWIRSEDVGS 
PKMIKLQKLFQSPSVpHFFDTKYHGNILTMTQDNG 

CPn_0279 313546 312875 

* Possible ABC Transporter Permease Protein 

KKIMQSpLI^IxK£TVOTLYMVSTAFFFSCAIC^MLGU3LFCrSPKSLNPKKSLYATIS 
MI LSFLf AI PfrXlLIVILFPITRWIVGTSLGPTASI VPLT IGAI PFWT IWDAFRNSAL 
NYLESAVAKaPKRKI LFG I LLPESY PQL I FS LKSLWHL I SCSTLAGFVGGGGLGQLLL 
QYGYYRREWS^SViyiTLVLIESVRILGDFVWRR 

CPn_0280 314593 313550 
dppF-Dipeptide Transporter ATPase 

IKGEAWLVSEQHS/l I SVQDVSKKLGDH I LLSKVSFSVYPGEVFG IVGHSGSGKTTLLRC 
LDFU»^S^^AGFDNSLPTQKFSRRNFSKKVAYISQNYGLFSSKT^ 
HHSMKSE^QVYDTLNFI^LYHRHDAYPGhJLSGGQKQKVAIARAIVCQPEVVLCDEI 
TSALffPKSTEl& IERLLQLNQERG ITLVLVSHEIDWKK ICSHVLVMHCGAVEELCTTEE 
LFL^ENSITNELFHEDINIAALSSCYFAEDREEVLRLNFSKELAIQGIISKVIQTGLVS 
IN^SGNrNLFRKSP^FLIIVXEGEVEQRKKAKELLIEUJWIKEFY 

CPn_0281 315033 316103 

Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin family) 

f I SLRRHTLMLNI HD I I^NDDF.NLLSYOCKHITKDKLTLPSHDFVDKVFGLSDRNNRVLRS 
LOTMFS HGRLANSGYLS ILPVDQG I EHS AGASFAIN P I YFDPEN I VKLA I ESGCS AVAST 
YGTLSLl^RKYAHKIPFMLKLWHNELLSYPTKYHQIFFTQVEAAYSMGAVAVGATVYFGS 
ETSNEEIVAVSNAFAXARSlJGLATVLWCYLRNPAi^ANG'/DYHTAADLTGQADHLGATLG 
ADIVKQKLPTCC<WFKAINFGKTDERVYSELSSNHPIDLCRYOVLNSYCGKVGLINSGGP 
SGKNDFTEAARTAVINKRAGGMGLILGRKAFORPLSEGIQLLNLVQDIYLDPNITIA 

CPn_0282 316084 317529 

xasA/gadC-Amino Acid Transporter 

ILILQSLJ^FSKKVFMHSHSKPTKPLGTFTVGMLSLAVVISLRNLPLTAKHGLSTLFFYGL 
AVICFMIPYALISAELASFKPQGIYIWARDALGKWWGFFAIWMOWFHNMTWYPAVLAFIA 
ST I VYK INPELAHNK\*Y I ATVI LAGFWI LTFFNFLG ITS3ALFSSICV I IGTL I PGVILV 
SLALFWI FSGNP I A ISLSWGNLLPNFSNVSSLVLLAGMLLALCCLEANANLASDMVNPRK 
NYPKAVFIGAIATLTILVLGSLSIAIVIPKEEISLVSGL'/KTFTLFFDKYNLSWMTGIW 
VMT I AGSLGELNAWMFAGTKGLFISTQNDCLPRLFKKV: I3KNVPTNLMLFOG I WT I FTL 
LFLCLDSADLVYW I LTALSVOMYLAHY ICLFLAGP ILR I KEPRAQRLYSVPGKFLC ICTM 
SILGILSCAFALWVSFLPPRELAQISEGSKIGYTTFLLLAFSLNCLIPFGIYFTHKRLSK 
KS 

CPn_0283 318581 317532 

No robust homolog present in Genebank/EMBL as of 11/7/98 
GRRL3YFQDLIKNAVAKIISFRKSPPNPVKLLIKFAKKGLKNSSIAPLYEVLLEILEAPG 
EEILEVLFSLDPMWLKSMLDPKKH3TLGIEISSETAETIESCSLGLIS INLLLSGLCLRS 
S f I DR 30 A VK 1 1 QO FC FQ FS S E E VQN F VEQ RN I LT P F LH H L F EC D EV AL LNQ L.TLRLDLIV 
PNALYPEPDrsCW0:*INSEEX'AKDAED0QEDFNKTKE/\CKEGLKKLVLPAL3 TTS I POLL 
RARRFKQGAETLMA [PRKKMKQNPF rFLEALLE.'JEEF" r."'/nKYLKLl-MriULWDKLLHA 
[YLGYF9TGLIC0 1 >Ft ETFi'RRANtJ^r'EAFOAA rOO/'PLLGFLFPKMLLD 

CPn_fj;iH4 il l M) l j4 UHSSl 

No robust homolog present in Genebank/EMBL as of 11/7/98 

i-'LijMri r pafvvpv f in- rwNNT:;r;Y^;i^LK:::;LRp [tyi. f la t la i atlm:;vi.yr:g r £ 
v< vitv] / ;ml i pl; i V* ■ vi a'vay t .fyoo^.i i KKTKVFf: it::p.",vff: ;of.pi .ni.i .u ;Rt:r-:D;; 

V::AlDEL[.KNFPA[J|iKliRPKMI.r'Y:;HFI.DEyf;RPNF.:;F'KEDnHT;:KII. 
a-nJJ^lS <.:i)t.ui H'MiM 

No robust homolog present in Genebank/EMBL as of 11/7/98 






AAPLCLLVWCCAASVCSMMA [VSLMCLYKGCKPL r EP^^kpPTKDLE I KDPESLKPV 
PVEGQSLPK ERKTVSFKAK I PS I VEDDFKPYVIQSTF^^M^SKP IAERMQSLEKEIT 
TLIVDFPRALEESSKSSGSLLRCVISEIKNLFLPRFLo^^PrSLTACLP.RLGSIVEEYA 
SSDLLI LLLTK PEPLNMVTQQL I AHLNSLKTEKRKLTPHMQKLVLS INFWFYGWSLEEKC 
TEKIVAYDPNLLTDELKAHLEAGNIVQFLLSFOSSEMQREFRALFPSDAOELPSAKCGSN 
YVPAINSSEYMYDFKDLSVLKKSL3ERLAFCEKIPSPSSWNFTSSVASHYKDFSLLFTFF 
SNQQSV I LQN P F LL I E LLH EN P KCQT FLKGLL EKAM PMSNWAALFR PMLMCM LCSG I AR K 
KELKI TAEHLCVPFKETTQAIASCKILDLLLQHLFDF 

mgtE-Mg++ Transporter (CBS Domain) 

SCRESKGKIMVGEQNRNEEKLDTAFSSGNLMDSRTSHLDDELSFKLEKAFTCLSTDIHSH 
DLSK IVIEYNP I DLAYAVSCLPSESRAILYKNLSC ITAKVAF 1 1 NTDSASRWAI FRRLS D 
SEVCAL I EQMPPDEAVWVLDDI PDRRYRR I LEL I DSKKALKI RDLQKHGRNTAGRLMTNE 
FFAFLMETTVKDVSACIRSNPGIDLTRLVF^DFKGELQGVVTDRSLIINPPEMSLKQIM 
NO I EHKVLPDATREEWDLVERYKI AALPWDEENFL IGA ITYEDWEA I ED IADETIAR 
MAGTTEDVGYQTCHWQRFL1JIAPWLLVTLFAGLISASVMAYFQKISPALLALIIFFIPL 
INGMSGNVGVQC ST I LVRSMATGTLS FGRRRET I FKEMS IGLLTGWLG ILCGLWYLMG 
FLGLNI FSGGG I QLGVTVATGVLGASLTATTLGVLS PFFFAKLGVD PALASGP I VTALND 
IMSMI IFFLIAGGINFLFFN 

CPn_0287 324230 322089 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RRCM I RS PL PF I SS KRALNMLGLQDEFSC PEDWDFLFS E I ELLAQQDEPS EGYLALSRS 
LLMMTHNH P KWKRV I FYGVSYGLKHKSMS I F I DVLTY I DFLFEKLG I SASDRLSLCSAR 
TCINFELYSCTGEMKFLSEVVDNFRLIEQLLKMHPQLKNRLGWEHFRIGAKQEEVSLV^ 
ASVYQAVGRS F I ELYHKHL ELSDLACGMKCLAXALDLSPNNAHI HADYAKGLWLGTRQG 
KSLLIERGMEHFSKAIFLSFSRDGDTLAYQNYRYSYAIASVKLFDLTYKKEHFDQAMNIL 
YQTVQAF PNLSGLWMVWGELLI RSGWLNSNMKY I EVGLEKLASLQKKTNDP I ALSGLLAT 
GIAILGLYLEEPNLFKDSRHRLISAMRTFPGNSALVHALGWQLCSALYFNEDSHFASAI 
SCFQSCLEWDUDATGMV^KLFDAYFSWIKKXSARLLRK^ 

RGLALKCLAEAT IDEAYKEIFLSESLLHYQRAWDI^GRLEILEI^SHYUAELQQSLF 
HYDEAYTI^TKVDLTLSSSRVKLILAAVLLGKGRIXQDTDPAE£AR£ILEPLVEVYL£DE 
NFLUJXKVYLFLFWKNKNVCI^KLARTY^ KDVNKAWG 
MVIRSAQYGVRITEAKWUTOPYLAm-REIHAFREVVENQKGRLWLGl^ 

CPn_0288 325785 324571 

CT288 hypothetical protein 

IS IT I REFLFFGFECRAKFYNVIMSCFNLTSTNESLRP ISPKASFPKQGWQSYFRSALRK 
HRSDTLSVSVCKVNKYDANLFVRLTVIALAVVGVLILFS IMLAS I QGTLV I T SWPLVTAA 
l££^ILLTGGMYILHRLGKKVDVISGVCIPPFSRRCVA^ISSSHTLEKFDEKHVSACSY 
LDISTLSADGSGI AAVYQC PPLLFRAFPCPG I PCAMPFVALLRMIYNLI RFLWPFY 1 1 F 
RM*YEHFFCKHLPEDDRFIYKDVAREM3RSIAAFIJ(APFYA^^ 
LMGSVERDWNDWILARSVSLANEAHSLFRJ'EGGGGRXGUKJHAFYLML^ 
KGEI VSGAHPS IQLPERRGLDTSGRYPH ISVI PDSGNDSAKNFIV 
111 

CPn_0289 325797 326996 

CfJl9 hypothetical protein 

NFMLMKKQRSHYKKNNLLLLLS I LVGLGLGSVQSPWI VYSAEC IANTFLKFLRLLS I PL 
VBSAI£STITSIQNFNTMVTLGKRILYYTLLTW 

TK^PU3YLDVLSDTLPE2^IFKPFLQGWISAACLAVLLGTASLFIX2EKE^ 

F^FLNIJ^GGLKI^PIAMLGFSVILFKEUCDQSNL^ 

IJil^INKVSPI^KVAKAMSPALVTAFFSKSSAATLPLTMEIJ^DDLiCINKNLSR^ 

V IMMNGC AAF I L I TVL FVATSNGM I ISPLMSLGWIFIATLAAIGNAGVPMGCYFLTLSLL 

TSMNVPLS I LGL I LPFYTVI DM I ETSLNVWSDCCWSLAN 

CPn_0290 327027 328523 

Na+-dependent Transporter 

RSA^TMNKKHASFSSRLGFIFSMIGIAVGAGNIWRFPRVAAQNGGGAFLILWUTFLFLWSJ 
IfiLI 1 1 ELS IGKLTKKAP IGAL I KTAGKKFAWAGGF I TLVTTC I LAYYST IVGWGLSYF 
YAVSGKIHLGNDFAKLWTSHYQSS I PLWAHLTSLGLAYLVIRKG IVHGI EKCNKIL I P^ 
FLCTI ALLLRAVTLPGAVQG IKOLFSCDKSCFSNYXWI EALTONAWDTGAGWGLLLyYA 
GI^^SKKTGWSNGALTA ICNNLVSL I MGIII FSTCASLD ILGTTQLQDGAGASS IG DTF I 
YLP^FTRLPGGIYLTTLFSSIFFLAFSMAALSSMISMLFLLSQTLAEFGIKPYISETLA 
Tt=!MFVU3IPSALSLTFFSNQDTVWVALIVNGLIFIYAALVYGFPKIJ(KEVINA^DL 
RL^FDYIIKYIXPIEGILIJ^GWYFYEGLFPENGQWWNPI^^ 
WKFWKQLYLRFSRYNHEIL 

CPn_0291 328658 329194 

incB-Inclusion Membrane Protein B 
EKHMSA PIPTPQELS DQ ITCLNVQYQQVS ELARENKGD I EGLKTLTAALTAD' AG IQ PS AD 
ElYSUJTAAALILSASEKPGSGPSGSTEGSWVQSPCKFKKVIAVVLTIIJCLIAIAVLIA 
CIIAACGGFPLLLSAI^LYTIGACVSLPIIASTSVALICLCTFVANSLI^PVITVRTTR 

CPn_0292 329201 329836 

incC-Inclusion Membrane Protein C 
VKNTKf JSDFMTS P I PFQSSGDASFLAEQPQQLPSTSESQLVTOLLT^MKHTQALSETVLQ 
OQRDRLPTAS 1 1 LQVCGAPTGGAGAPFQPGPADDHHH P I PP PWPAQI ETE ITTI RSELQ 
LMRSTLQQSTKGARTGVLWTAI LMT I SLLAI 1 1 1 1 LAVLGFTGVLPQVALLMCGETNLI 
WAMVSGSIICFIALICTLGLILTNKNTPLPAS 

CPn_0293 329940 332723 

CT234 hypothetical protein 
VWSMQRVLRLLFNLHHGEEKRAFLFFLLCLVWG IGCYCTLStfAECLF IEKLCSAELPK I Y 
LGCSLILCVLSSLILYNLFKKHISATALFLIPVSLSILCNB7LILSSIFAIDPPRSPLFF 
YR-I VIWSLT I LS YTSFWGFVDQFFNLQDGKRH FCI FNAI BFLCDAIGSG I IASLVHT IG I 
QC I L I LFT AALVLTF P I VFYVS KS LKS LSDDH DLF I DTGH PP PLS KALKLCFYDKYTFY L 
LCFYFLMQLLAIATEFNYLKIFEIQFASKEEFELVAH IG5KCSLWISLGNMCFALFAYSRI 
VK RLG VNN 1 1 L F APLC F LS L F LFWT F KTT LS I AVLAMWR EGVTY ALDDNN LOLL I YGVP 
NK I RJ%> I R I WESF I EV IGMLVWC L t CFLtfSQQYVFQL 1 1 SL I AT I LVCLVRSYYAKAI L 
KNLnAOAL^LTRSMODW[KSMTVKOKROVELFLLAHZKHPSERH0TFAF0HLLNLASR3V 
LPSLLAIIMNKLSLPNKLKTrEMVKSSLWAKDFLTLELLKRWTSrFPHPAIASAIHLYFAE 
IIDLLII TTH [AEDLYDTVGDRLLAA [LTVRROEAYOPYRDLADKRLKELLNSCOPEDIVMG 
LT [ LKLEKNPQNFP I LLDFLNTKNED t LIVTCKALl IT^VRANUKPYCrELLKP-LROCoHN 
DEAIJOYLLKTI:; IALD I:;r-VKDLLMTT::OLKNT/RKYA[-:AM k'KLDKEVAPAFLQVLTDE 
f THNRCft [LAAKALCK IDNWLLKKHAYKIVKlJUW^KALFYnYMi'HY IOKKYPTYNLIjLLA 
NTI JJ:;HYYAKVNFMl.:lLtAJ I LGSMEHSGVLIBALTSKNOK [KAOALESLEKT K.'DSHLFSL 
LKI'PyH0l<;Mf:Y«EKYYFKaWIPLTLKELl/MMENflP:;:JLNKL.TAQ0LKEEL3YCDPDF 
O^Wf r YNQKHEDKR'PEE^ETL ISKLS [ 
CPn_0294 
cAMP-Dependent Protein Kinase Regulatory Subunit 

t RNFFMNL [ DRAFLLKKT T^^HpMDLLItT [ ADKT ET T I FK PG3NVFS rCOPGFSFYII 
VECY IT rSKEKLE5PLNLKPLl!cFGEESL/F^KPREYNA::ANT0VRMLV LCKGQ r LNIVE 
ECP5VALSFLELYAKQIKFREP 

CPn_0295 33386S /333S27 

acpP-Acyl Carrier Protein 

AMSLEDDVIAI IVEQLGVDPKEVNE^SSF IEDLNADSLDLTELIM7LEEKFAFEISEEDA 

CPn_(J2'.tb iiA'I'it jj-IOJJ 

CT296 hypothetical protein 
K I P I RGMICMD I TLVCKKV I VTCCSRC IGLG I VKLFLENGADVE I WGLNEERGQAV I ESL 
TGLGGEVSFAR\^SHNGGVMDCVQKFLDKHNK I DI LVNNAG ITRDNLLMRMSEDDWQSV 
ISTNLTSLYYTCSSVI RHM LRARSGS I INVAS IVAK IGSAGQTNYAAAKAG 1 1 AFTKSLA 
KEVAARNIRVNCLAPGF I e/dMTSVLNDNLKAEWLKS I PLCRAGTPEDVARVALFLASQL 
S SYMTAQTLWDGGLTY 

CPn_0297 335724 334774 

fabD-Malonyl Acyl Carrier Transacylase 

SRS^^CDDNF^^KKRYA^FPGGGSQYVGMGQDLYMEYPEVRELFDFANERLGFSLTSIMTE 
GPEDLLMETVHSQLMYUlSMAVVKVt^QRSSIQPSLVSGLSL^ 

LEL\mKRGQLMNEApIQSPGAMAALLGLPSEVI EEN ITSLGQG IWIANYNAPKQLWAG I 
AEKVDQAI ELFRDLGCKKAVRLKVSG AFHTPLMCVAQDGLAPD I Y ALCMKDS SL PLVSHV 
VGKSLVNTEEMRBCLARQMTSPTLWYQSCYH I ES EVDEFLELG PGKVLACLNRS IG I SKP 
ITSLGTFAQIEKFLSEV 

CPn_0298 336742 335717 

fabH-Oxoacyl Carrier Protein Synthase III 

YTSFFLYMWf SVNKNKKAA IWATGSYLPEKVLSNADLEKMVDTSDEWI VTRTG I KERRI A 
GP<3EYTSLMGAIAAEKAIANAGLSKDQIDCIIFSTAAPDYIFPSSGA1AQAHLGIEDVPT 
FDCQAACIGYLYG LSV AKA YVESGTYNHVLL I AADKLS S FVD YT DRNTC VLFG DGG AACV 
IGESRPGfiLEINRI^LGADGKLGELLSLPAGGSRCPASKETLQSGKHFIAMEGKEVFKHA 
VRRflETXAKHS I ALAG I QEED I DWFVPHQANER 1 1 D ALAKRF E I DESRVFKSVHKYGNTA 
assvg /aldelvhtes IKLDDYLLLVAFGGGLSWGAWLKQV 

CPn_0299 336726 337415 

recR-Recombination Protein 

LVYYSESLYSNLNLGPRPECKNKIH ITMTRYPDYLSKLI FFLRKLPGIGFKTAEKLA 
FEtlSWDSEQIiCILGNAFHWASERSHCPt^FTUCESKEADCHrcREERDNOSLCIVASP 
PVFFLERSKVFKGRYHVLGS LLSPITGKHI ENERLS ILKSRI ETLC PKE 1 1 LAI DATLE 
aDATALFUCQELQHFSVNISRIALGLPIGL^FDYVDSGTLARAFSGRHSY 

CPn_0300 337768 340152 

yaeT-Omp85 Analog 

GRIljGMLIMRNKVILOISILALIOTPLTLFSTFJCVKEGHWVTJSITIITEGENA 
PKIJCTRSGALFSQLDFDEDUIIIAKEYDSVEPKVEFSEGKTNIALHLIAKPSIRNIHISG 
NQVVPEHKILKTL^IYRMJLFEREKFLKGLDDLRTYYLKRGYFASSVDYSLEHNQEKGHI 
DVLIKINH3PCGKIKQLTFSGISRSEKSDIQEFIG/TKQHSTTTSWFTGAGLYHPDIVEQD 
SLAITNYLHNNGYADAIVNSHYDLDDKGNI LLYMDI DRG SRYTLGHVH IQGFEVLPXRL I 
EKQSQVGPNDLYCPDK IWDGAHK I KQTY AKYG Y I NTNVDVLF I PHATRP I YDVTYEVSEG 
SPYKVGLIKITGNTHTKSDVILHETSLFPGDTFNRLKLEDTEQRLR^YFOSVSVYTVR 
SQU)PMGNADQYRDIFVEVKETTTGNLGLFLG FSSLDNLFGG I ELS ESNFDLFGARNIFS 
KGFRCLRGGGEHLn^KANFGDKVTDYTU^KPHFLNTPWILGIELDKS INRALSKDYAV 
QTYGGNVSTTY I LNEHLKYGLFYRGSQTSLH EKRKFLLG PNI DSNKG FVSAAGVNLNYDS 
VDSPRTPTTGIRGGVTFEVSGLGGTYHFTKLSLNSS I YRKLTRKG I LKI KGEAQFIKFYS 
^^ITAEGVPVSERFFLGGETTVRGYKSFIIGPKYSATEPCOGLSSLLISEEFQYPLIRQPN 
I SAFVFLDSGFVGLQEYKI SUCDLRSSAGFGLRFDVM^INVPVMLGFGWPFRPTETLNGEK 
IDVSQRFFFALGGMF 

CPn_0301 340163 340762 

(OmpH-Like Outer Membrane Protein) 

IKDLSKEIFWFRKGFWYPFS I PKLVQVIMKKLLFSTFLLVLGSTSAAHANLGYVNLKRC 
LEESDIXjIGCETEELEAMKO^FVKNAEKIEEELTSIYNKLQDEDYMESLSDSASEELRKKF 
EDLSGEYNAYQSOYYQSINQS^^VKRIQKLIQEVKIAAESVRSKEKLEAILNEEAVLAIAP 
GTDKTTEI IAILNESFKKQN 

CPn_0302 340766 341866 

lpxD-UDP Glucosamine N-Acyltransferase 

SKFKEFSMSEAPVYTLKQLAELLQVEVQGNIETPISGVEDISQAQPHHIAFLDNEKYSSF 
LKNTKAGAI I LSRSQAMQHAHLKKNFL ITNES PSLTFQKC I ELF I EPVTSGF PG I HPTAV 
I H PTAR I EKNVT I E PYW I SQH AH IGS DTY IG AC SV IG AHSVLGANCL IHPKW I RERVL 
MGNRWVQPGAVI>GSCGFGYITNAFGHHKPLKHLGYVIVGDDVEIGANTTIDRGRFKNTV 
tHEGTKIDNQVQVAHHVEIGKHSIIVAQAGIAGSTKIGEHVI IGGOTGITGH ISIADHVI 
M I AQTGVTKS ITS PG I YGGAPARPYQ ETHRL I AK I RNLPKTEERLS KLEKQVRDLSTPSL 
AEIPSEI 

CPn_0303 342982 341921 

CT303 hypothetical protein 

REQKCLHHMDVSRKINRHTQFYVDSIDGVIKNFDHKPSEDKSRDHEELEEKLLTITKRIV 
ASAQEFONRKTDSKNYYLKKTQWLPFKNEELEQTKELFAMLTSMDKKIAQLFFYSPGCSS 
DWVEFTEVICHLNDSIGLGG\^LCCGLFECXXEHVVTVNKKLDLPLLLGTTVVNSLRYYL 
TYRNIS LLNCQ3MS ELGKELGDVLKQHGVAFT LIFKEIVDI DLLNYVK L IQG LKRSGN IQ 
AR I YDNDVPTLPSVSS S P I ALR YSLAm* I RGLALHVDFS S LK F I S PS I LSNT EHT AKALN 
SGGECF r FSNLDEFNLGMK I VMQLLRTGK rSPEI LNKN I MK I LMI KRRVRSLY I 

CPn_0304 .14(091 34^1 1SH 

pdhA/odpA-Pyruvate Dehydrogenase Alpha 

DOKPLPKRLPYKKVMDSSAPYNrAKtyrrEK.TTVEP. I LDLYGPA.SC I KFLKQMVLIREFEA 
RGEFj\YLECLVaiFYI!^YA^QEAVATAArANT'';LDPWVF^SYR('MAU\rLLNIPLOEIAA 
F.LLflK ET'ICALORGG.' iMHMl *G PHFPC ;r ;p( ; [ VC'VJ I PLAACAAPr I KYQROKNRVSLCFIC 

ir.MAtj ;vkmrtlnfv.*;mio[j , lml 1 1 ennt ;w,':Mf jT^lnravakop i ae;jo*.i:iuyd irav 
ivhk;i^>LKN::ijAiFRKAYHYMVMT.:;pvr.vi-:r-i^::;RFR(;H::i;jDPN[^i<::KEi-m:LFKK 
rjp[vi ( AK[WL[RLFA/r;rRKbTON[R0^KTAVLp;AFSNAKLr;r:np::srrT[,f-:Rf;vYA 

<:Pn J) ',<>', 144 142 MM ;7 

pdhB/odpB-Pyruvate Dehydrogenase Beta 

i uap i \: vaav: ;cu) k ;aai .: k ilp r* 1 1 f-:pm: :wr w-r, \vmaxj i i : ;i i aakmi i i-'mti * ;kf: ;vp [ 
VFR^PtJf;AAAOv:n:un:;m:vK::f,YANrr<;f.n i Ar\';Nf 'YOak^m .k::a i rnnnpvt.flen 
KLEYN[.Kr;F:vi*PEKYi.vPi*:KAHRV0i-y;niJi.T[ iTY::uMv:;[TKivV::i.AKKRWf;L:;tEI 
. tDLRT [Kr'LUI3TILS3VRKT3RCIVIEEGHYFAGIS ITEHVFDSLDAPPLRVC 
QKETPMPY:;K ILEQATLPNVNR [ LDT I EKVMR 

CPn_0306 345136 346431 

pdhC-Dihydrolipoamide Acetyltransferase 

GK FVI Z LLKMPK LS PTMEVCT I VKWHKKSNDQVSFGDVI VEI STDKA I LEHTANEDGWI R 
E ILRHECEK IV [CTPt AVLSTEANEPFNLEELLPKTEPSNLEA5PKG33EEVSPATTPQA 
AGATFTAVTFKPEPPLj.TPLVFKHVGTTNNLSPLAPOLAKEKNIDVSSIGGSGPGCRIVK 

km .»:".*. i - i •!-':' : .v :i- ;v ; ': : i ! r/rr <-,y. yheew.:: r- * ? hv r aap i /.y-y-.f : :; ; p> i fyyp.o?vy 
a: : f! .,* :ua v i:l.O/V.' : r ki .. : : rip * e vpa»:ai.ai yF.rT: • t m.7 iKnrr/r-f ir rvRrrrr ro [ J r a 

VAI PDGI ITPI IRCADRKNLCMI SAEIKSLAJ-KARNOSLQDTEYKGGJF CVSNLGMTGIT 
EFTAIVNPPOAAII^VGSVTEQALVT,DGEITIGSTCNLTLSVDHRVirx:YPAAMFMKRLQ 
KILEAPAVLLLN 

CPn_0307 348999 346515 

glgP-Glycogen Phosphorylase 

NGC I VEDFSS FDKNKVSVDSMKRA I LDRLYLSWQS PESAS PRD I FTAVAKTVMEWLAKG 
WLKTQNGYYKNDVKR\/YYLSMEFLI£R£LKSNLL£^^ 

SDAG LGNGGLGRLAACYLDSMATLAVPAYGYG I R YD YG I FDQR IVNGYQ EEAPDEWLRYG 
NPWE I C RG EYL Y PVR FYG R V I H YT DS RG KQ VADLVDTQ EVLAMA YD I P I PGYGNDTVNSL 
RLWQAQSPRGFEFSYFNHGNYIQAIEDIALIENISRVLYPNDSITEGQELRLKQEYFLVS 
ATIODI IRR YTKTH ICLDNLADKWVQLNDTHPAJjG IAEMMH ILVDREELPWDKAWEMTT 
VI FNYTNHT I LPEALERWPLDLFS KLLPRHLE 1 1 YE INS RWLEKVGSRYPKNDDKRRSLS 
■I VEEGYQKR INMANLAWG S AKVNGVS SFHSQ LI KDTLFKEFYEFF PEKF INVTNGVT PR 
RWI ALCNPRLSKLLNET IGDRY I IDLSHLSL I RS FAEDSGFRDHWKGVKLKNKQDLTSR I 
YNEVGEIVDPNSLFDCHIKRIHEYKRQLMNIUWIYV^ 

APGYVMAKL 1 1 KL INSVADWNQDSRVNDKLKVLFLPNYRVSMAEH I IPGTDLSEQ I STA 
GMEASGTGNMKFALNGALT IGTMDGAN I EMAEH IGKENMF I FGLLEEQ I VQLRREYCPQT 
ICDKNPKIRQVLDLLEQGFFNSNDKI)LFKPIVHRJ^HEX;DPFFVIJ^LESYIAAHE>a^NK 
LFKEPDSWTKIS I YNTAGMGFFSSDRAIQDYARDIWHVPTKSCSGEGN 

CPn_0308 349213 349596 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FFT0E^NMAWA(7rF>QTTQPQPSVSHKATHRYCSWVFFKPILVSLGUJ^ 

SGVTLSLCIGIVIJUQIVIJU3IALVIAFNHIRQTO 

LEDRYSSK 

CPn_0309 350977 349595 

CT309 hypothetical protein 

FMRAWEEFLLI^EKEIGTNTVDKWLRSLKVLCFT3ACNLYLE^ 
SGLV^MWPIRVHVTSVDKAAPFTKFJCQM^ 

Dl!(E?RVLQEFTKSPDENGGVTFTJPIYLFGPEGSGKTHl/lQSAIS\^RESGGKILYVSSDL 
FT EEHLVSAI RSGEMQKFRS FYRN I DALF I ED I EVFSGKS ATQEEFFHTFNSLHS EGKL IV 
VSi^YAPTOLVAVEDRLISRFEVCVAIPIHPLVQEGLRSFt^IROVERLSIRIQETALDFL 
IYffiSS^/KTLLHALNLLAKRVMYKKLSHQIXYEI3DVKTL^^ IR 
NV^QYYGVSQESILGRSQSREYVLPRQVAlfYFCROKI^LSYV^IGDVFSRDHSTVISSIR 
LI^gKI EENSHDIHMAIQDI SKNLNSLHKSLEFFPSEEMI I 

CPn_0310 353472 351049 

60IM-60kDa Inner Membrane Protein 
YF£5ELSLIFRWQMNKRTLLFVSLIGIAFVGCQIFFGYNEra^ 

AVgSVGLSVASWDTDVNG E EHKNNYAVRVGDKLFLLHNG EAAQSVY S SG ESWS FVDHKCG 
FDl&HLALYRC£GSSFNPTT*nX3KWLPTNHEGLPVLWEF^ 

KDSgr FGTAL VFWRSG SDY I PLGLYDSREEKLVSLDLP ITRAVI FGNDQDSAKS SDTANH 
YVtiF^YMQIIVSEESGSIEGINLPFASTNlJKSIVNEIGFDRDLiASEKSPEALFPGLSSK 
LPDGQQAKNS IGGYYPLIJ^GLLSDSKKLLPLEYHALNWSGRELATPVALRYRVLSYTP 
HSlQLES LDRSVQKVYKLPENPEEKP YVF ETA ITLTKETEDVWVTSGVPEVE IMSNASAP 
T«^VIKKNKGSI^KVKXPKVK£PIAIRRGWPG*«L^^ 

LYJSGSTAPTRLSAISPKNQLYPVSKYPGYETLLPLPKDAGTHRFLVYAGPLAEPTLKVL 

DKp^QEKGENPE^n^SISFRGWAFITAPFAALLFIIMKFFKLVTGSWGISIILLTVFL 

KLLLYPLNAWS I RSMRRMQ ILSPYIQQ IQQKYKNEPKRAQME IMGLYKTNKVNP ITGCLP 

LLi^PFLIAMFDLLKSSFLLRGASFIPGWIDNLTAPDVLFSWQTSIWFIGNEFHLLPIL 

LGIWUXJKVTSLHKKGPVTDQQKQCWIGNMM^^ 

WQ$WITNKI LDSKHLKNEWLNNKKHR 

CPn_0311 354453 353575

CT351 hypothetical protein

DMR^EMAVIYWDRSKIVWSFEPWSLRLTWYGVFFTVGIFLACLSARYLAI^ 
FSKSQLRVALENFFIYSILFIVPGARLAYVIFYGWSFYLQHPEEIIQIWHGGLSSHG 
GFLLWAAIFSWIYKKKISKLTFLFLTDLCGSVFGIAAFFIRLGNFWNQEIVGTPTSLP _ 
WFSDPMQG VQGVPVH PVQLYEG I SYLWSG I LYFLSYKRYLHLGKG YVTS I AC I SV^F I 
RFFAEYVKSHQGKVLAEDCLLTIGQILSI PLFLFGVALLI ICSLKARRHRSH I 

CPn_0312 354518 354976 

CT101 hypothetical protein 

CTMARN IKYFLI LFPG ILWISAGHKLLLKATAI AIJDPLSSFFTYCLLSMVSWGLAfiLKHR 
YLLSKTIRKQLSLSSEFFSQKITWIAYIKQTFISRRFLIMVIMIAFSLVLRRYIZNPQAL 
FVrRATVGYALIKTAIAYFSKLONALMENPEGN 

CPn_0313 354957 355355 

acpS-Acyl-carrier Protein Synthase 

wkilkeisansmeiihigtdiieisrireaiathgnrllnrifteaeokyc/ektdpips 

FAGR F AGKE AVAKALGTG I GSWAWK D I EVFKVSHG PEVLL PS HVY AK I G 16 KV I LS I S H 
CKEYATATAIALA 

CPn_0314 356285 355353

trxB-Thioredoxin Reductase 

M r HSRL 1 1 IGSGPSCYTAA I YASRALt^PLLFECFFSGIGGGOLMTTTyEVENFPGFPEG I 

lgpklmnnmkeqavrfgtktlaodi isvdfsvrpfilkskeetyscdaciiatgasakrl 

E I rcAONDEFWQKCVTACAVCDCAS P I FKNKDLYV lOGGDSALEEArA'LTR YGSHVYWH 
RRBKLftASKAMEARAONNEKITFLWNSE tVK ISGDS IVRUVDrKNVTTOEITTREAAGVF 
FA IGHK PNTDFLOCQLTLDESGY I VTEKOTSKTSVPGVFMCDVOPKYYRQAVTSAGSCC 
[AALDAF.RFLG 

H i tH»:;«ini.i I Ptott.-in 
Mf-KOAKYTW ::;KK [[.DNtECLTEDVAEFKDLLYTAMR IT.ISEDEl'DNEIO^A [LKfJTW 
1 UNKDK WVI JVf ;LK::R( ;v I PMMEF [ D:"riEf ;LVUjAEVrVYLD©AI- , .DFFA.-KWLrjREKATR 

«jf<(jwi-:y ilahckki ::: i vkcoitrkvkccmvd u ;meaflp< idnkk r KNLODYVCKvr 

KFK [ LK fHVKRRN t VV^RRELLEAF.R LSKKAEL IE*jIM ffJEJtRK^WKN [TDFGVFLDLD 
< : I DG U .1 1 1 T I WTWK R[ U I ( p; ; FWVEI .NOELEV r t L; J 7f;K EKf/R VAU iL KOK EH NPWEDIEK 
KY PPCKRVLGK I VKLLPYC^^MEC I ECL I H I U EMSWVKN I VDP.^ET/VNKCDEVEAIV 
LT> rQKDECK ISLGLKQTErI^MJ EEKY P IGLHVNAE I KNLTNYGAFVELEPC I EGLIH 
I"DMSWIKKV.^HP.'3ELFKKGNSvHf\VILoVDKESKK rTLGVKQLoSNPWNEIEAMFPAGT 
VISCVVTKITAFGAFVELONGrEffiLIHVSELSDKPFAKIEDI ISIGENV3AKVIKLDPDH 
KKVSLSVKE"/LADNAYDQDSRTE/LDFKDSOGPKERKKKGK 

CPn_0316 359TA94 360121

nusA-N Utilization Protein A

r w :.\ : i/r'-"MF.KRK" roR:™ r : "a : r-::'Af .k r AAKrr: .prow. 
r rr."hT. ;r* r -wr. r vk i • prr: :kk r ru -KAREYDfi \ ",m« :o ymdv p f*/" on fg r :,vv; 

AAKU: I*jUKL.RHAERUV I Y JttY KHKVNLTL^GWKH KAKG JNL I IDLGKVEAILPTRFYP 
KTEKHKrGDKIYALLYEVQtSENGGAEVIL^RSHAEFVKQLFIQEVPELEECSVEIVKIA 
REAGYRTKLAVRSSDPKTpPVGAFVGMRGSRVKNI IRELNDEKIDIVNYSPVSTELLONL 
LYPIEIQKIAILEDDKV^IVVNDADYATVIGKRGINARLISHILDYELEVQRMSEYNKL 
LE I ORLQLAEFDSPHLDC PLEMEG ISKLV I QNLEHAGYDT I RRVLLASANDLASVPG I SL 
ELAYKI LEQVSKYGESHVDEKPE I ED 

CPn_0317 360045 362750

infB-Initiation Factor-2

SLLIRSL^KSANMElMaTKNLKUCIKNAQLTKAAGI^KIJCQKLJ^AGSS 

AKEKSVKVAI^TSfrPTASAEQASPESTSRJ^IRAKlTOSSFSSSEEESSAWrPVDTSEPAP 

VSIADPEPELEVVpEVCDESPE^PVA£VLPEOPVLPETPPQEKELEPKPVKPA£PKSVV 

MIKSKFGPTGKH J^LLAKTFKAPAKEEKWAGSKSTKPVASDKTGKPGTSEGGEQNNR 

KQFTJPANRSPASGPKRDAGKKNLTDFRDRSKKSDESLKAPTGRJDRYGLNEX^EEDRWRKK 

R VYKPKKHYDE«S IQR PTH I K I S LP I TVKDLAAEMK LKAS EV IQKL F I HGMT YWND I LD 

SErTAVQFIGLE/GCTIDIDYSEQDKiri*SNDTVl^EIQSTDPSKLVIRSPIVAFMGHVDH 

GKTTL I DSLRKSNVAAT EAG AI TQHMGAFCC ST PVG D I T I LDT PGH EAF S AM RARG AEVC 

DIWLWAGD^IKEXKLEAIEHAKAADIAIWAINXCDKPNFNSETIYROLSEINLLPE 

AWGGSTVTVOTSAjCTCEGLSELLEMIJ\I^A£VI^IJCA^ 

tvliq^sl4clgealvfndcygkvxtmhnehne PKAGDPFF 

WKNEKTARDI IEAJlSAGQQRPAIjC^KKRPNFDSMI^NXKTLKl^IKADVOGS I EALVSS 
ISKIKSEipDVEILTNSVGEISESDIRIAAASKAVLIGFHTGIESHAEPLIKSLGVR\/^ 

ftviyha/daikeimtslldpiaeekdegsaeikei frssqvgsiygcivteg IMTRNHK 

VRVUIN^ILV«GT1^SIJ«VKEDVKEVRKGLECG I LLEGYQOAQ IGDVLQCYEVI YHPQ 
KL 

CPn_0318 362704 363126

rbfA-Ribosome Binding Factor A
VMSwVMKLSIIHKNYNLKYCMTEN^ 

RVSLS?CDL^SARVWSVMPHENTKEEALEALXVSAGFIAKRASKKN^^ 
IF^PQDYIENLLWQIQEKEKS 

CPn_0319 363133 363879

truB-tRNA Pseudouridine Synthase
/ 1 FPGNLNT IKDMTMDLAVELKEG ILLVDKPQGRTSFSL IRALTKLIGVKK IGHAGTLDP 
FATGA7MVMLIGRKFTR15DII^FEDKEYEAIAHLGTTTDSYDCrc 
/LSAAEYFQGEIC^LPPMFSAKKVO^KKLYEYAjyCGLSIERHHSTVQVHL^ 
FWSCSKGTYIRSIA>IELGTMI/XGAYIXQIJUUJlSGRFSIDECIDGNLli)HPDFDISPY 
LRDAHGNSL 

CPn_0320 363824 364783 

ribF-FAD Synthase 

TT P I S I FLPTYEMPME I AYSLTSSFSVDSVTVGFFDGCHLGH SNLLS I LTSYSGSSGVI T 
FDSHPQTVLS LNHTKL INT KEERLQLLQT F P I DWLGVLTFDLNFANQSAEEFLTLLHRNL 
KCKRLI LGYDSC IGKEQQSNTEALDT IGKPLGIEVIKIP PYRMDNI WSSKA I RQFLSAG 
NLECAHRFLGHPYAISGKITEGSGIGGSLGFATINLPREESLIPLGVYACEIRYDSTTCQ 
GVMNLGTAPTFGRESLYAEAHI FSFAENLYGKEVS 1 1 PRKFLREEKKFQSKETL IRAIEK 
DI LDAQDWFAKGSFNYEGTA 

CPn_0321 365900 364767 

ychF-GTP Binding Protein 

YSKKHV I IFI FRCLMSHTECGI VGLPNVGKSGLFNALTGAQVASCNYPFCTI DPNVG IVP 
V I DERLEALAKI SNSQK 1 1 YADMKFVDI AGLVKGAS DGAG LGNRFLSH I RETHAIAHWR 
CFDDPDWHVSGKVNPVEDIEVINLELIFSDFSSAKNIHSKLEKLAKGKREVGALLPLFD 
TIIAHLEKGLPLRTLELTPEQIVALKPYPFLTMKPMFYIANVDESSLPDMDNDYVAAVRE 
VAAKENSKWP ICVR I EEE I VS LP IEERLEFLMSLGLEKSGLHRLVRAAYDTLGL ISYFT 
TGPQESPJVWTVVRGSSAWEAAGEIHTDIQKGFIRAZVITFEDMIECQGRAAARELGKLHI 
EGRDYIVQDGDTMLFLHN 

CPn_0322 366231 367328 

yscU-Yops Translocation Protein U

SNLGNSMGEKTEKATPKRLRDARKKGQVAKSQDFPSAVTFIVSMFTAFSLSTFFFKHLGG 
FLVSHLSQAPTRHDPVITLFYLKNCLMLILTASLPLLGAVAWGVIVGFLIVGPTFSTEV 
FK PDI KK FN P I EN I KQKFK I KTL I EL I KS I LK I FGAAL I LY ITLKS KVSL 1 1 ETAGVS P I 
ITAOI FKE I FYKAVTS IG I FFLI VAI LDLVYQRHNFAKELKMEKFEVKQEFKDTEGNPEI 
KGRRRQ I AO E I AYEDS SSQVKH ASTWSNPKD I AVA IGYMPEKYKAPWI I AMG INLRAKR 
I LDEAEKYG I P I MRNVPLAHQLLDEGKELKF I PESTYEAIGE I LLY ITS LNAQNPNNKNT 
NQPDHL 

CPn_0323 367322 369460 

lcrD-Low Calcium Response D
1 SFIMNKLLNFVSRTLGGDTAI J NMINK3SDLILALWM^K3VVLMIIIPLPPPIVDI1^ITINL 
S I SVFLLMVALY I PSALQLSVF PSLLL ITTMFRLG I NI SSSRQ I LLKAYAGHVIQAFGDF 
WGGNYWCFI IFLI IT I IOF I WTKGAERVAEVAARFRLDAMPGKQMA IDADLRAGMI D 
AT0ARDKRA0IQKESELYGAMDGAMKFIKGDVIAGIVI3LINIVCGLTIGVAMHGMDLAQ 
AAHVYTLLS IGDGLVSQ I PSLL I ALT AG rVTTRVSSDKNTNLGKEISTQLVKEPRALLLA 
GAATLGVGFFKGFPLWSFS I LAL I FVALG I LLLTKKSAACKKGCGSCASTTVGAACDGAA 
TVCDNPDDYGLTLPVI LELCKDL3KL I0HKTK3G0SFVDDMI PKMROALYODICIRYPGI 
HVRTDS PSLECYDYM I LLNEVPYVRG K I P PHHVLTN ET/EDNLSRYNLPF ITYKNAAGLPS 
AWVHEDAKA I LEKAA r KYVrPLE*/ 1 1 LHLGYFFHK.OSQEFLG IOEVRSMI EFMERS FPDL 
VKEVTRL r PLQKLTE r FKRLVQEQI3 r K DLRT f LE; j LH EWAQTEKDTVLLTEYVRiS SLKL 
Y I :;FK F3QG05A I SWLLDPE r EEM I RGA I K0T:JAf jSYLALDPDCVNL t LKGMRNT ITPT 
rA( '/ JQPPVLLTA IDVRR YVRKL I ETEFPD [ AV f :;YOF. ILPEIR rOPLCR IQ I F 

CPn_0324 ity-mw i/Of.H'H

CT i?A hypothetical protein

V WAI I R R 1 fMAASGGTGG LGOTC^J 1 /! ILAAV VJWA KADA/» VA/AT,Q EGiJEMNM I OO^ODLT 
NrAAATRTKKKEEKF0TLE:;RKKGE^f;KAKKK:;i-;:;TEEKrr/rD[J\DKYA:^NrJEI::GOEL 
KlXI'LAii:UDA^PEDIUVLVyEKIKDI>AI^:;'[7WJ5YLVO*rrPP:;U(«LKEALtOARWrHT 
tVhv;KTA K 1AKN t LKA: JQ E Y ADOLf/V.'J P: :i ;[ .F. ; ;j ; { \ .EW.im fTC'DOLLl'MLODRYTYOD 
MA rvr;.';FtWKCMATELKRO'!P'r/P:;AOWjVI 1AT i:\V.t Jf/jAVtTtlYDYFEjIRVP [LLDCLK 
AFG rnTr^ULNFVKVAEnYHK I mDKFPTASKVERI 
^SRLF^ADKRC^AMIANALDAVNtNNEDYPKAJ 



EVlJ^Hff 



pVDSVTGVLNLFFSALR 



37U48 



CPn_0325 

GTLPEWFRERI rKAAL3VNCCFQSS IKG ILCYGEVTG^LYLSDILSMNYLNGEKLFEYL 
KLFSLHAKIWMESLRTONLPDLHVLGIYYVA 

„,.:.. W-nfT "■us*-- :■'.:/■'. 
mk^eSSenfpdltkkfhde^ffs^ 

A^^L^GEDLG II PQDVKTTLTHLG ICGTR I PRWERNWESDSAF I ^^^^^^Y33^y 

CPn_0327 372927 373211 

r™srkcSprS^ 
eenrflklkisasalrhidklglekvleraksknf 

CPn_0328 373220 374992 

LKYRE I FMSFLRRH I S LFRSQKQLIDVF APVS PNLELAE I HRRVI EDQGPALLFHNVIGS 
ARF^FPF^SMSSVNLD^ 

tyh5u»i?p^^ 

vvS^S^ 

FSETANDTLDYTGPSLNKGSKG I FMG IGKAI RI)LPHGYQGGK IHGVQDI APFCRGCLVLE 

^LEmCIICSLL^ 

ATHRPNY^PFVIDAIJ^PSYPKF^EVDPSTKQKVSE^WHAYFP^ETFYI 

CPn_0329 375085 376146

Phospholipase D Superfamily [leader (33) peptide]

kwrqkdLkicvi^^ 

AI^ADEEI FLRIYNLSEPKIQQSLTRQAQAKNKVT IYYQKFKI PQ I IJCQASNVTI-VEQP 
PAGRKIJ^QKALSIDKKDAWIjGSANYT^SU 

k»*gkyfwdwciaiqavleki^ 

IfSftS H SKLTFKQLRQLN I NKDFVS I NT APCT LHHKF AVI DNKTLLAGS INWSKGRFSLN 
DfeStl ILENLTKCXJNQKIJ^IWKDtJ^SEHPTVDDEEKEIIEKSLPVEEQEAA 

CPn_0330 376930 376202

C1GJ3 hypothetical protein 

FI&IEMIJXSRQLFSVLPSRFQDUiVYRFra^ 
RK£ gVRKRR EKNYLRI F RVL S R FDVMR 1 1 RFDPYGALS AQ S I AKD S RQN S PLVEK ^ ^ EE v 
ATNEA I RLALLA I GDREQEEKKQRHRYKLI^KQAKVLLSQLRHVHLDFKKLYCDSKKKB 
DQEKDEKNKOKRS IKVTKKKKG I SLGAAASQA IAAAAEAWVI ARNKGVLETASTLFYQKD^ 

CPn_0331 378452 376701

I^IMWSGGG^PSSDre^PAI^GEQABGPS PUCES I FSETKQASSAAKQE£LVR 
SGSTGMYATESOINKAKYRKAODRSSTSPKSKLKCTFSICMRASVCCFMSGFGSRAS^VSA 
KR^DSGEGTSLLPTEMDVALKKGtTOISPEMC^FFLDASGMGGSSSDISQLSLE^lCSSA 
FS"GARSLSLSSSESSSVASEX3SFQKAIEPMSEEKVNAV^AIU^ 
L\Mr1aMATGNEGM I DLSDLGQEEVST AMTS PRAVEGKVKVSS SDSPEANPTG I^SNTLE 
PAE^ZAEKQESREQLSEDQMMLARAMAGLLTGAAP^EVLSNSWSGPSTVFPtfPKFSGTL 
PTOSSGDKS KHKS PG I EKSTNHTNFS PLREGTVKS AEVKSLPHPESMYRFPKCS IVSREE 
PEAWKESTAFKNPENSSQNFLPIAVESVFPKESGTGGALGSDAVSSSYHFLAQRGVSLL 
APLPRATDDYKEKLEAHKGPGGPPDPLIYQYRNVAVEPPIVLRSPQPFSGZSRLSVQGKP 
EAASVHDDGGGGNSGGFSGDQRRGSSGQKASRQEKKGKKLSTDI 

CPn_0332 378676 378536 

CHLTR T2 Protein 

YLDSRI RVI PLARQRCTLLHLLAVLC PP I S FFTQGVS PCVFFCFLDp 

CPn_0333 379117 378800 

VDFFVFVFFMCKPKK5RTDRALAQE IQKKSTEVLKKPAR IKAK^RRKFLI AKEQKTLKHR 
AQ EYDQLVRSLLDSQKK DTDKVL I FNYENG FVFTDKDHFS KYS^ RL 

CPn_0334 379308 379823 

CT079 similarity
TMSVH ITPRKCF I LCI LSMFTLPTLFPKAHLI LFS PY IVJfcFYCFSKDKGLVLALGCGVL 
-^DLALGSRGVFLLLYPLTALITHKAHL I FSKESKAALV WNMIFYGVFLLLT I PMCALFG 
H EVRWS I DVLM I PLKC SFLDNL IFTSVIYILPCA I NSGyT H KM I SF F RRLVC Y 

CPn_0335 37980i? 380674

folD-Methylene Tetrahydrofolate Dehydrogenase
E IGMLLFG I PAAEKI LQRLKEEI 5QS PTSPCLAVVL IGNDPASEVYVGMKVKKATEIG 1 1 
3KAHKL P3DSTLSCVLKL I ERLNQDPS I HG I LVQEPL PKHLDSEVI LQA ISPDKDVDGLH 
PVNMGKLLLCNFDGLLFCT PAG I I ELLNYYEI P/RCRHAAIVCRSN IVGKPLAALMMQKH 
KjTNrTVTVLH:;QSENLPE ELKTADr 1 1 AALGAfPLF I KETMVAPHAV I VDVGTTRVPADN 
AKG YT L U j DVD F WNWT KC AA I T TV PGGVG PM^T V AM LMSNTWROYQN FS 

f.T'Mj)": J*'* !H05t;9 

K-[!)KMKSNHS::::WI<RW::H 
i'YI< [ Vl/j"T:;i.:'AKt;KA:"t IDRCFHKflDlI [ YNNWNPYSEL.'T [ INRAPADVP [TLiiVEL 
SKFtlXjVryrLYKLoI&RFnt'rVGPLKT^ 
fri'KTLlKKNlllVOIDUYJWKr.YAVI^ 

i k: : kaa< rr r i ,o 1 1 a «a i at: ;< ;ni i [oip'VRf ;k r yth r ldtrtc.ki i .eussyp iqijvswh 

K ;t .'AYAfjA [ A'rVt.MTn\"K I EAKOWAEE1II i [ LTY niDGA.™ 
CPn_0337

^RYELRKLEGKIAQKGKTLIPIXIM^-'" R GYVWRLGCCRGKKAYDKRR . 1 1 EREKEREV 
AAAMKRRHH 

CPn_0338 "22?/ 383375

r-f i v J * i\-t^t^r >i- i ■ " i 1 '"y-Mn 

P^ I QNALR F S LP AEQ LKTM UsRTS F A VS R EES R YVLTGVLLA I ANGVAT ^jj£'^^^^*^^^ — ^4!5 

^AE^KSFSCEYIIPI^^ 

E^roi^PVISTESNVKL^ 

GEG WSI^VNYSGELLE ^ftFNPF FFLD I LKHSKDELVS LG I SDSYNPG I ITDS ASG LFV I 
MPMRLHDD 

CPn_0339 393405 384034

dtitfgsshffle9ofekdhlpqai^iytdkcokk 
ISrllis^pad^flnli^ccdnhytlclsyyhr^ 
ws^aptypsng^svvrnfqiypknfgltt 

CPn_0340 383842 384156

plypllivLIsrssaekcs^ 
wsnnlke^lalkfkssliknsdisetavaeefhkqls islprdle 

CPn_0341 384160 384495

hagi#ervgqi^paptiotlitsthmhgelpktslvlsienaqvseoi^ 

CPn_0342 384619 385062

valahpdcpeeakkeklfswllrtqglh 

CPn_0343 384999 385595 

LP^^I^ 

SLDVLILSGNRHSKFL PFRLPYENDGKVCT I ETKLDTPHKAYVI HT SHTYI I TNR^LYL 

mkeflkegnttpiiehvpeaalec/^^ 
pkkwslnqkneinpeklek 

CPn_0344 387432 385558

RIGCIPFGGYVRIRGMERTKEKGEKGKIDSVYDI PC/3FFSKS PWKRILVLVAGPLA^LL 
AVL^SILYl^RSKNYSDCSKVVGVWPVIOA^ 
SL^HL^EIKRF^TVPSKEFAIDVEFDPTKFGVPC^ 
NSElAPNDRFVWMDGTLLFSMAQISQII^ESYAFVKVAi^ 

YLRNELIDTQYEAGLKGKWSSLYTLPWINSYGYIEGELTAIDPESPLPOPQERWI^ 

ILMDGTPVSGSVDILRLVQNHRVSIIVC^PQELEEVNSRDAXIKR^ 

LNHIXSESHPVEVAGPYRIiDPVQPR^ 

AEKQKPSLGISLICDLKVRYNPSPVVMLSNITKESLITLKALVTCHLSPQ^ 
LHTGWSVGFSEVLFWIGLISMNLAVI^PIPVU^ 
LVPFTFLLIIFFI FLTFQDLFRFFG 



CPn_0345 388587 387436



CT345 hypothetical protein 
LKVACLKHI^VWSTGSIGRCTLEI\mRYPSEFKIISMASYGNNLRU^FCX3^ 
AVYNEEVYNEAC0RFPHWFFLGQEGLTQLCIMI7rVTTWA^ 
ALALANKE I LVC AGELVS KT AK ENG I KVL P I D S EHN AL YQCL EGRT I EG I KKL 
PLU'HCSLEELSCWKQDVLiNHPIW^GSKVTVDSS 

VIHPQSLIHGMVEFU^SV'ISIMNPPDMLFPIQYALTAPERFASPRDCWDFS^OTLEFF 
PVDEERFPS I RLAQQVLEKQG S SGS FFNAANEVLVRRFLC EE I SWCD I LRKLTTLMECH K 
VYACHSLEDI LEVDGEARALAQE I 



CPn_0346 389690 388704



070-troD/ytgD-Integral Membrane Protein 
KKGSLMALGPSPYYGVSFFQFFSVFFSRLFSGSLFTGSLYIDDIQIIVFLAISCSGAFAG 
TFLVLRKKAMYANAVSHT\ : LFGLVCVCLFTHQLTTLSI/3TLTLAAMATAMLTGFLIYFIR 
^FKVSEESSTALVFSLLFSLSLVLLVFKTKNAHIGTELVLGNADSLTKEDIFPVTIVIL 

ANAVIT I FAFRS LVCSSFDSVFASSLGI P I RLVDYLI I ^0 L S^ LV ^^^^^^Y?^^f^^ 
LI I PSL I AKVIAKS I RSLM.\WSLVFSIGTAFLAPASSRAILSAYDLGLSTSG ISWFLTM 
M I WKF ISYFRGYFSKNFEK I SEKSSQY 

CPn_0347 391078 389678

069-troC/ytgC-Integral Membrane Protein

TFGTNP EALSRKT IW I VLI MLSCVFS DT t FL3 SFLAVTL ICMTT ALWGT I LL I SKQPLLS 
ESLSHASYPGLLVGALMACWFSLQAGIFWIVLFGCAASVFGYGIIVFLGKVCKLHKDSA 
LC FVLWFFA IGVI LAS YVKESS PTL YNR I NAYLYGQAATLG FLEATLAA I VFCASLFAL 
WWWYRQ ryVTTFDKDFAVTCGLKTVLYEALSLI F I SLV I VSGVRSVG I VLI SAMFVAPS L 
GARQLSDRLST I LI LSAFFGG I SGALG3Y I SVAFTCRAI IGQQAVPVTLPTGPLWICAG 
LLAGLCLLF3PKSGWVIRFVRRKHFSFSKDOEHLLKVFWHISHNRLENISVRDFVCSYKY 
Q EY FG P K PF P RWR VQ I LEW RGT/K K EO D Y Y R LTKKG R3 E ALR LVRA H R LWES Y LVN S LD F 
5 K ES VH ELAEE I EHVLTEELDt ITLTEILNDPCYDPflPQt IPNKKKEV 

CPn_0348 3^1815 1910;'.7

068-troB/ytgB-ABC transporter ATPase
F.GWLLNVKDETFWSVHNLCV'N'i'EHAAVI.YM [^FnLGKn.jLTAIf/jPNCAGKSTLLKASLG 
LIK.P.CGGT/'^FFNQKFKKVRQP. rAYMP0RAr!VfArfDFPMTVI.,DLAf.jM'.;i'Yi3YKGMWGRIGS 
DDPREAFH ILERVGLElJVADROI^LSf ^;QQOPAFLA[<ALMOKADLYLWDELF:'AI DMAS 
FKToVnVWELRCOGKTI^VHHDLSIiVROLFDHWLLNKRLICCGPTDECM J*.IDT IFQT 
7GOE I ELLEOTLKLGRGKOFGC: 

067-troA/ytgA-Solute Protein Binding Family
W I LKNA: :R EMUAKMC Y I FKVMRW I F r ; FVACG I TFGCTN^^ftftNSR PC I L5MNRM I KDC 
VERWCNHLATAVLIKGSLDPMAYEMVKGDKDKrAGSP^^BlLGLEHTLSLRKHLENN 
PNUVKLGER L I ARCAFVPLEEDG tCDPH I WMDL3 IWKEfl^^TEVLIEKFPEWSAEFKA 

NSEELVCEMS I LD3WAKQCL£TI PENLRYLV5GHNAFSTFTRRYLATPEEVASGAWRSRC 

isPEGLSPEAQtSVRDIMAWDYINEHDVSWFPEDT^^ 

KPLYSDNVDDNYFSTFKHNVCLITEELGGVALECQR 



CPn_0350



miV) 391684 



EOLLKMLKTFILESETPRTINPKPKAPRGSKKRRDFINFTKTDIERVLELARyVGDKDLL 
AR FS PKKPLTSLKRELIRS I RNG IVSVELWNAYVEAVKAVSSPNLEVTSPFV 

CPn_0351 393861 395432 

adt-ADP/ATP Translocase

KIKWQRVNMTKTEEKPFCKLRSFLWPIHTHELKKV^^ 

IVGAPGSGAEAIPFIKF^WPCAIIFMLI^^ 

VIYPLRDVLHPTEFADRI^AILPPGLI^LVAILRNVn'FAAFYVLAELWGS^ 

ANEITKIHE^RFTALFGIGANISLLASGRAIW^ 

IVSGLVt>iASYWWIM<^TDPRFYNPEEMQKGra^^ 

LLVI AYG IC INL I EVTWKSQLK LQYPNMNDYSEFMGNFSFWT WSVLIMLFVGGNVIRK 
FGWLTGALVTPVMVLLTG I VF FALVI FRNQASGLVAMFGTTPLMLAVWGAIQN I LSKST 
KYALFDSTKEMAYIPLDQEQK\nCGKAAIDWAARFGKSGGALIOQGLLVICGSIGAMTPY 
LAVILLF I IAIWLVSATKLNKLFLAQSALKEQEVAQEDSAPASS 

CPn_0352 395478 396830

No robust homolog present in Genebank/EMBL as of 11/7/98

WVGIFFINSHFTNSYAFFNQKVIITVRHSGCTMKCSPLTLVPHIFLKNDCECHRSCSLKI 

RTIARL I LGLVLALVS ALS FVFLAAPI SYAIGGTLALAA IVI L I ITLWALLAKSKVLP I 

PNEIXJKIIYNRYPKEVFYFVKTHSLTVNEIJCIFINCWKSGTDLPPNLHKKAEAFGIDI 

SIDLTLFPEFEEILWNCPLYWLSHFIDKTESVAGEIGL^QKWGLI^PLAFHKGYTT 

IFHSYTRPLLTLISESQYKFLYSKASKNQVroSPSVKKTCEEIFKELPHNMIFRKDVQGIS 

QFLFLFFSHGITV^AQMIQLir^OWKMLCOFDKAGGHCSMATFtMFIiJTET^ 

S^EPTVNFNTIWEUCVIXEKVKESPMHPASALVQKICWI^ 

WTSSLPQYAFHAQTYKLEKKIESSLPIRSSL 

CPn_0353 396893 397135 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LRFRNIKKSLIFIKRIRYSQSGKEQKGARPFFKKSITSSLVILLLEAIFNENFSSIIQNN 

FNKNFKNKNI S INRI FVKFT I 

CPn_0354 397062 398507

No robust homolog present in Genebank/EMBL as of 11/7/98
YKfelKILKIKTFUJGFlANIJlWroiDEPRl^ 

TI^VISAILLCGALIAFLCVAAPVSYILSGALLGLGLLIALIGVILGIKKITPMISSKE 
QVFFQELVNRI RAHYPKFVSDFVSEAKPNLKDLISF IDLLNQLHSEVGSSTNYNVSEELQ 
QKlDTFEGIARLKNEVRTASLKRLESAASSRPlJPSLPKIWKVFPFFVrtXSEFI 
VEt^VKKIGGSLEEDLSDYIKPEMLPTYWLIPI^FRPTNSSII^HTLVLARVLTRX^ 
QHLKyAAIi*GEWNUWSDI,NTMK^ 

RY^QMSLIKWPADLWENLCCLTl^HTGRPQDMEFASLIGTLYTQGLIHKESEAFLSS 
LTt^SLWFKTIRRQSTNIAMFLENIATHNSTFRSLPPITVHPLKRSVFSQPEEDESSLL 

IG !0 

CPn_0355 399955 398591

No robust homolog present in Genebank/EMBL as of 11/7/98
IRDFYLHI IYTAFNRS I SKELAMSMT I VPHALFKNHCECHSTFPLSSRT IVR IAI ASLFC 
IGALAALGCLAP PVSYIVGSVLAF I AFVI LSLVILALI FGEKKLPPTPRI I PDRFTHVI D 
EAtQLS I5AFVREQQVTLAEFRQFSTALLCNI SPEEKIKQLPSELRSKVESFGISRLAGD 
LEKNNWPI FEDLLSQTCPLYWLQKF I SAGDPQVCRDLGVPRECYGYYWLGPLGYSTAKAT 
IF^KETHHILC^LTKEDVLLLKNKALQEKWOTDEVKAIVERIYTTYTARGTLKTE 
KEflSKELLLLSLHGYSFDQLQLITQLPRDAWDWLCFVDNSTAYNLQLCALVGALSSQNL 
LDESS I DFDVNLGLYVIQDLKEAVQAFSASDEPKKELGKFLLRHLSSVSKRLESVLRQGL 
HR I ALEHGN ARARVYDVNFVTGAR I HRKTS IFFKD 

CPn_0356 400465 400109

No robust homolog present in Genebank/EMBL as of 11/7/98
KQVQtFQYMNESGWDWLCDFDSQGEGFQLSRLVGLLHSSWALYEAKEQFYLPEVSLLT 
ELf£#QLLSKPTKHGVAKDLCl^EKHFQRFRQYI£^ 

CPn_0357 401341 400469 

No robust homolog present in Genebank/EMBL as of 11/7/98
YSSKNGASMVN IQPVYRNTQVNYSQATQFSVCQPALSLI I VSWAAVLA I VALVCsQSLL 
S I ELGTALVLVS LILFASAMFM IYKMRQEPKELLI PKKIMEL IQEHYPS I WDFIRDQEV 
SIYEIHHLISILNKTNVFDKAPWLQEKLL/^FGIEKFKDVTiPSKLPNFEEILWHCPLHW 
LjGRLWPMVSDVTPGTYGYYWCGPLGLYENAPSLFERRSLLLLKKISFGEFALL^DGLKK 
NTWSS3ELVQIR0NLFTRYYADKEEVDEAELNADYEQFDSLLHLIFSHKLS 

CPn_0358 401757 401578

No robust homolog present in Genebank/EMBL as of 11/7/98
EEVLSVSMKLIPTQDSIERETDSKRDKKIFTIYICSSKVLAGHFFSHLDK^JKIHESIGV 

CPn_0359 401994 403817

lepA-GTPase

ITLOY I LKEYKI ENIRNFS IIAHIDHGKSTIADRLLESTSTVEEREMRrfQLLDSMDLERE 

rg rrr KAHPVTMTYLY EGEVYQLNLI dtpghvdfsyevsrslsacegaLlivdaaogvqa 

OS LANVY LALERDLE 1 I PVLNK I DLP AAD PVR I AQQ I EDY IG LDTTNI I ACS AKTGQG I P 
A I LKAI IDLVPPPKAPAETELKALVFDSHYDPYVGIMVYVRI isceukkgdritfmaakg 
r>5 FEVLG TGAFLPKATF I ECSLR PCQVGFF I ANLKKVKDVK IGDTWKTKHPAKTPLEGF 
KErNPWFAGIYPIDSnDFDTLKDALGRLQLNDSALTlEOESSHSttGFCFRCGFLGLLHL 
KI IFEP.I [ R EFDLD 1 1 ATAPSV I YKWLKNGKVLDI DNPSGY PDBA 1 1 EHVEEPWVHVN I 
tTTOEYLrjrUMNLX:LDKRGIWKTEMLDQHRL\^V/ELPLNEI^DFNDKLKSVTKGYGS 
KDYRL/JDYHKC:; [ t KLEVL rNEEP t DAFSCLVHRDKAESRGRS KTEKLVDVI PQQLFKI P 
I QAA r r I Y KV [ AR ET \ R A [ & KNVTAKC YGGDITRKRK LWEKQK J/IKK RMK EFG KVS I PNT A 
V I KVLKLD 

r[M J) •H) r ))i*-1 Al)V)22 
CTif.n hypothetical protein
vAE.(/rti f;t.r< ;lavmc;knlvlnm idhcfsv^wnrtpektjAdflkeypnurelvgfe^le 

M-VN:;iJ-:ftM(K[MLWlUAi;KWDO:;rn^^ 

i : l e .Ft / vi' ; t : ;< a :e w jarik ; r<:; tm pcgnpeawplvap r FOf; laakvc* ;rpcc:-;wvgtcgac 

MYVKAVHf k; TRYf IP [(JL U!F.AY<"i [ I.RDKLKLjATAVAT/LKEWNTLELESYL IR iagevl 
A r vn prr rPviDT:u3vvG(^^»^AiDAi^^r/rL:;Li :ga^uArf-:;:,-wkeireoa 

^^PUfS^DP3VF^P/ALYA3K 1 1 SYAQGFMLU^EASKEYHWGLOLGEIA 

IPCLAAA IT FYDGY RT AS S SMS LA^GLR D Y FG AHTY ERN DRPRGEFYHT CWVHT KTT ER V 
K 

CPn_0361 40S65/0 405 362

^''i'ki-laaL' ; rV;- v ■ / ;mv- ;i-p:mK'.:^-:k:;;. !..•:'.•:•- "sr:."'.* :TACL.PYL?ty/ 

TLVNNADWLQE I Ibr UJu/'JKi I KK LaajML V KL r " I KyRVH J [iEG I JYTEr J YLI LQS YD 
FY HL FKNYGT I LCCOG S DQ^GN ITSGIDFIRRKG LGQ A YG LT Y P LLTNAQGKK I GKT ESG 
TVWLDSDLTS PF^YQYLLRLPDDT I PKIARTLTLLSNEEIQDIDRRVQTDPVAVKEFVA 
QD I LSAIHGDLGLEEALSjrTRSMH PGNLSSLSEKDFHELFAGGMGASLDKSEVLGKRWLD 
LFLVLGLCKSKGEIRRL^QKGVYINNVPIANEHSVCEEQDICYGHYVLLAQGKKRKLVL 

YLN 

CPn_0362 407843 407055

fliA/rpsD-Sigma-28/WhiG Family
UJKKKFVKTCtfTQN/lFAWFY^ 

DLYASCA^LVRA^RYNPERSRRFEGYAVFLIKAAIIDDLRKQDWVPRSVHQKANKLSG 
AMDSLROSt^KEPTDLELCEYDJISC^ELSGWFVSARPALIVSL^EEWPSQSDEGAGMAL 
EERIPDE^ETGYy6vWK0EFSIX:LANArOELEEKERKVMALYYYE 
ESRVSQ IHSKALLKLRAALSAFR 

CPn_0363 409700 407943

flhA-Flagellar Secretion Protein

eavfvsgkkdgvko^ifvplsilvliflplpoii^dfglcisfalsllwcv^^lnssn 
saklfppffTy^uj^u^strwivssgtasslivslgsffslgslwaatfaci^ 

FVNFtWS^SERIAEVRSRFFLEALPAKC«AU)SDLVSGRASYKAVKKQKNALIEEGDF 
FSAMEGVFRFVKGDAI ISCILXLVhIWSVTCLYYTSGYAiEQKWFTVLGDALVSOVPALL 

tscaaat/i skidkeesllnylfeyykqlrqhfrwsllifslcci PSSPKFPIVLLASL 

LWLAYRi^EEPAS EDSCIE1RAFSYVIXJACPKEOESOFYQVYRAASEEWEDLGVRLPVLTS 
U^IEEIRPWLRVFGQNVYLDEMTPEAVLPFLJ^IA^ 

IVPKK^LSSLVVI^RIXVRERVSUa.FPKILF^VAVYQNSGDSLEILAEKVRKSLG^ 

grslwtokqtlevitidfhveelinssysksnpvmq 

SCET^EMKKMLDPHFPDLLVLSHDELPKE I PI SFLG I VSDEVLVP 
CPn_0364 409954 410238

fe ?SMA^VITSDDEC^EFELEDNSEIAEPCESM^ 
TEPEYDFLGEPEDSNERLACQCRIKGGCVKVTF 

CPn_0365 410498 411544

No robust homolog present in Genebank/EMBL as of 11/7/98
/ FKGTQVNSLIMATISPISLTVDHPLVCTKKKSCSNFDKIQSRILLITAIFAVLVTIGTLL 
IGLLLNI PVIYFLTG I SFIAWLSNF ILYKRATTLLKPRACGKHKE I KPKRVSTNLQYSS 
I S I AINRS KENWEHQPKDLQNL PAPS ALLTDN PYEIWKAKHS LFSLVSLLPGGNPEHLLI 
SASENLGKTLLIEETSQNAPISSYVDTTPSPKSLLNEAIQETRVEIhn'ELPAGDSGER^ 
PDFRGRVFLPOI PTTPEAIYQYYYALYVTY IQTAINTNTQ I IQI PLYSLREHLYSREL 
■ Pj'QSRMQQSLAMITAVXYMAELHPEYPLTIACVERSLAQLPOESIEDLS 

CPn_0366 411976 412440

No robust homolog present in Genebank/EMBL as of 11/7/98 
MGYLPVS ATDVLFES PAAPLINS ANTQNQKL I ELKG KQQAES S PRT ITSVI LEVLLVIGC 

. CLIVLSLLAIRPALQFTLETGHPAAIAVLAVSGTILLVAVI ILFCFLAAVPFAAKKTYKY 
VKTVDDYASWHSHQQTPTLGT I FSG IVYAESQAQL 

CPn_0367 413078 413836 

No robust homolog present in Genebank/EMBL as of 11/7/98 

S FPLNRYFMTKTTSI PDVHENQSHLSVDERLI SESPVLTKKEVI AKI IKLTALI LALAI A 

VGTAWAGVLGMPL^IATGAALLAAWLSCLLLRRREPSKCT 

VQPSVPLDYQKLLRNEWTLVNTLSEINISOTLQDPNQRYYWE^C<3APITLVATTGDIAK 
PRLKTSGRVMIVNAANSbWQSGGAGTNAALSAATHPTCWNOTRTSGGKINTGKGLSVGEC 
RSAPWINRDWTNK 

CPn_0368 413766 414107 

No robust homolog present in Genebank/EMBL as of 11/7/98
TLAKDYLWVNAAOHPGS I ETGR INirrNPG EAHFLAQLLG PKY EGELKAH PEKLSNVI KKA 
YLNCFDEALNNQATWQVPLI SSS IYSPGGKLELEPVNQTKPNSSAYKLYH I RT 

CPn_0369 414345 415562 

CT058 hypothetical protein_2 

NIMTDSNPLPSYTDASLYRTPAKHSYPIRLPLNRTDRIEKILKIVTLTLALACALGFSIA 
AGILAMPIFSAVWITLAIAAVSLYSLLKKPKLYEILPOIEPESEQSSLSPSPQPPEOQD 
LPLQIDPLPDPESLPEVSLJU5LTTPPEELTAITVTPGYEALLEQNWDLLPSLAAVDPSFT 
T ETPQQPCF rWKLKDSKLI F I STSGD I AVPR I KTQGRVM I VNAANEN I SREGGGTNKALS 
LATSLC^WNASRLFRAHSRSGSQLQPGECRSAKWENSDHTSNDHVPGKAHFLAOLLGPEA 
AKCNNDPKQAFEV5KKAFHNLFQEAEI tGVDVIQLPLIGCNLFAPSRLLNLGKTRAEWI E 
AIKLALITSLQDFGWEQDNCEEQKI I ILTDKDQPPI I PPRFDLTTP 

CPn_0370 415755 416912 

CT058 hypothetical protein_3 

KRIFFKLFVFYLKSFMSTTEPNLTNVNLTMLI5SESMPTQLASHKLKGLDLVAFILIIGI 
AVSSGTAAI ILGIFLLFILTALAVLAFSILLYFLLREPKSPISVTHQPTPI IKDTDLPPV 
PPLALTPVPTEA^EEPPLPSPRTHQTLLO»iVroRIPDWANTOMPFIAADNffrcYAWHL 
KNSNLTLISTLGFIEKPRYKTQG IVMIVNAAT PNMANNfVKGTSLALAKATSVRCWENSKK 
SPDPLRSKQPLOLGECRSAKVENLflGTTNAGKAGLPOFLGOLLGPKASDYNYNPNDAFTF 
CRQAYLNCLNEAKRRKTTV\-OLPLLr:Sf!Fpr/:PKDEErTTSLRLC^irDGVKLALrDALOTF 
GSEAENQNOPWVT I LTTLARH PL ITP 

CPn_0371 4 1714 1 4 17^0;

No robust homolog present in Genebank/EMBL as of 11/7/98
KTMPVr3SAPLPT^MRPr>;iGNLx^LMF.r j rJ.':KAr,KAKII0nKTTKTrKr,[.VK I LVA I LVI EVLG 
[ [AAFFIPGTPP ICLI IUA*[*n,Tr/l/ .'V[,[,l,7r KLALVffKTRTrrAKOO IKRKLSGKG I 



CPn_0372 'U 'u^l 4 I MOM

No robust homolog present in Genebank/EMBL as of 11/7/98
NYRACHRN It ;NHM^^PVVTOT:";. , >Ar;f J 7[ i ;fJTKI/;i';Ft 1 KI<I J .'/;:'f ',(<( : I K [ AFAA.'JTALLLLN 

•nvrjG ivAiAMt f\*ats\a;ay rr/ v:, vlvi a.::\,i lla t mi . i : ;myk f -I't i rsown ■ [ sn 
CPn_0373 418)56 420218

gcpE

NSE r FE I FMTL rTPAINSSRRKTHTVR IGNLY IGSDHS I KTOSMTTTLTTDIDSTVEQIY 
ALAEHNCD I VRVTVOC r KEAQAC EK I KERL I ALGLN I PLVAD IHFFPQAAMLVADFADKV 
R I NPGNY I DKRNMFKGTK I YTEASYAQSLLRLEEKFAPLVEKCKRLCKAMRIGVNHGSLS 
ER I MQKYGDT T EGMVAS A r EY 1 AVCEKLNYRDWFSMKS SNPK IMVTAYRQLAKDLDARG 
WLYPLHLGVTEAGMGVDGI IKSAVGIGTLLAEGLGDTIRCSLTGCPTTEIPVCDSLLRHT 

;,'?"!}]■< .-i-f NyA^ri'K' w/[■^■!■:r.K^AIMTIlvr,;'Fl^^^,VFH^^MC•v['^■LY"l!NRE^rfD. r ;pA 
vhuaVkvhfhasdpfihtsr^ffekqghqgkptklvfsrdfdnkeeaaisiatefgalll 

DGLG EAWLDL PNL PLQDVLK I AFGTLQNAGVRLVKT EY ISC PMCGRTLFDLEEVTTR I R 
KRTQHLPGLKIAIMGCIVNGPGEMADADFGFVGSKTGMIDLYVKHTCVKAHIPMEDAEEE 
LIRLLQEHGVWKDPEETKLTV 

CPn_0374 420209 420961 

CT056 hypothetical protein

VDSMTLSFHTHPLNYWTFEEFDGLP I RHGVFSKQKDAEGTV7AAKNPEIASAL0SPKYCD 
LHORHGTSVROTPTSPTYQPAIXJLCTOSPIXSLHIRHSDCQAAIFTDR^H^IANVHSG 
WRGLLGNIYAVTVGTMKKLFHTKPQDLFVAIGPSIGPDYAIYPDYATLFPRSFLPFMNPK 
NH FDLRAI ARKQLTNLG ISKDRIF I S DLCTYT EHDAFFS SRYLAHHPDPrVLTGQHSKNRN 
NVTAVLLLPRD 

CPn_0375 421112 421615 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RLSMKI£ASTNHKVHEmPKKAQIJVEIEANKTQATE^ 

LMLAAG ITFVTFLALGFPLIQAYS IAGI ITLVGLAIGLVLLILSLLPKEDEEADALSRNA 
LLPLT I IVI EQQPITPKPE I PYSYLTKLALLTSLFLTLRRSSSQRKTH 

CPn_0376 421680 422294 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FKWTAKAPNLTEIRDHGARWSLFLLSPETSHWKGDKEVSAPLKQLQDLI/3EEQWEAMK 

TKMNSRKKAGQWAIFNSPTPGVSSTLVl^OTPV^ 

EFLKNLFVDLLE^FTSVHIHAEEAFTPLDHTGKPHFKRDNVYLPGK^^ 

VSADTQFTLFLTQDECNPFHDKKRG 

CPn_0377 423441 422347 

sucB-Dihydrolipoamide Succinyltransferase

I HTTEVR I PNI AES I S EVTVASLLVT EGAL IQENQGLLE I ES DKVNQL I YAPVSGR I FWE 
VS BGDWPVGGWGK I EPAG EG EELG DSQ S KET I EAE 1 1 C F PQSGVRQS P PENKTF I PLR 
DQHDQGSQGLSAGDRGCTRERMTSIRKTISRitLI^AL^^ 

KQEEFLS RYGVKLGFMS FFVKAVLEALKAYPRVNAY I DG EE IVYRHYYD I S I AVG I DRGL 
VVPVIRDCDKLSiNGEIEQKIADLALRAREGI^IAELEGG^ IN 
PPQVG I LGMHK I EKRPWLDNEIVI ADMMYVALSYDHRL IDGKEAVGFLVKVKEGLENPA 

CPn_0378 426195 423445

sucA-Oxoglutarate Dehydrogenase 

I^F^EF^FMDSEFVGQWSSOMWIESMYQRFMNHETLDPSWKYFFEGYQI/jQAASPSE 

AS^ISGNETIAMLQEQKSQFICTIYRYYGYLQSQISTLAPTTDSRFIQEKIAKIDLDEQ 

VP5AGLLPKAQVSVRELIEAIJ<KCYCGSLTLETLTCTPE]^ 

Utsm>LCKATFFEEFLQIKFTCQKRFSLEGGCTLVPMLE^ 

RGRJiNVLTNVLGKPYRYVFMEFEDDPAARGLESVGDVKYHK^ 

NA3HLESVDPIVEGWAAI£HC£HAGKEQSSLAILV^^^ 

Sf EGTLH IWNNYIGFTAVPRESRST PYCTD I AKMLG I PVFRVNSEDWAC I EAIEYALQ 
VrjjfifeFSCDVI IDLCCYRKYGHNESDDPSVTAPLLYDQ IKRKKS IRELFRQYLLEGQFAD I 
SgETLASIEKEIQESLNREFQVUCGTDPEPFPKKECm^ 

LE^SSPXCGFPDNFHPHPKIKTLLEKRMKMAEGGVGYDWAMAEEIJVFASl^IEGYNLRL 
SG^DSIRGTFSQRHLVWSDTVTGDTYSPLYHLSAEGXjSVEMYNSPLSEYAILGFEYGYAQ 
QA^LVLWEAQFGDF^AQIIFDQYISSGIQKWDLHSDIVUXPHGYEGQGPEHSSSR 
I ESRYLQLAANWNFQWLPSTPVQYFR I LREHAKRDLSLPLVI FTPKLLLRYPQCVSS I EE 
FTEPGG FRA I LEDADPNYDAS I LVLC SGK I YYDYAEMLPQDRRKDFSCLR I ESLYPLALE 
DI^^IDKYSHLKHFWI^EESKNMGAYDYMFMALQDILPEKLLYIGRPRSSSTASGSAK i 
LSRQELVTCMETLFSLR 

CPn_0379 426268 426765

CT053 hypothetical protein 
KNKKMLCTCSR IQDGNPWMKSERLKKLES ELHDLTQWMQLGLVPKKE I SRHQEEIR I 
KIYEEKERLQLLKENGEIEEYVTPRRSPAKTVYPDGPSMSDIEFVEPTETEIDIDP 
ELELTDEGREDGAVEVDYSHEDDEDPFSDRNRWRRGGIIDPDANEW 

CPn_0380 426671 427876 

hemN-Coproporphyrinogen III Oxidase 
KSTIPTKTMKTLSAIAIAGDAWSLIPMLMNGKAPLALYIHIPFCTKKCRYCS^mPYK 
S ESVS LYCNAV I QEG LRKLAP IQETH F I ETVFFGGGT PS LVS PLDLKR I LKELAPHARE I 
TLEANPENLTVSYLROLQETPINRISVGVC^FDDSILQLUSRTHSSSAAITAIiOECONHG 
FSNLSIDLIYGLPTQSLEIFLSDLHQALTLPITHISLYNLTIDPHTSFYKHRKILVPTIA 
QEEILAEMSLLAENLLLSQGFQRYELASYAKPDYPAKHNLYYWTDRPFLGLCVSASQYLH 
GERSK^SHISHYLRA\mKNLPTQETSEILPKKERIKEALALPXRLLEEA5fLAEFPSTLI 
SMLTQDVKLQNLFSVHGQCLALNRQGRLFHDTIAEEIMGYSF 

CPn_0381 429836 428037

CT326 similarity 
SLPNKFRALMTAPTESRSSPPTLLEETEPLSPNPIPADIQIPRITIS^PSLDVSTVASSA 
ED I 3VF I AGG PR S SS S ASVASDVY ELVC LCGG DED P E P PDS EVRT LYVNGSWCTHQ EAVQ 
EL^YISEVRGEAWLiYNEXSSGMSPWPISECRTLPTLDHPLCOALl/vWBQFFSAPENQN 
RE FLV I FYGDAS PY IQQALTQSRHSPRIVWG I SPTVFIQGDFRVHNYRVSGDFFSSLDC 
RGTRAENTT I LPYSSCLEGVFLPS I RCPS FTWAWFGEQCLVANBCEDVEDRGGLSQDAE 
R.SOLPHnERDL/WVrDSTDPSSMSRLVEWLNQGSPSSDMEINPYf'QRCPDVALSALYAIS 
]<V:;OLAOEWIL^\r:VHEGLDLOrCYSLILMHTTFAVRYFFIXFTnYPOSRERFRTARIVAQ 
: ri.VI.PfJ I LVLVFtXJCINVLRKLWMPOEILRAIF ISASTISGS lyFVECTRWMGROiLRHRVQ 
'JFVOQR V IGSGLPVGTVRASYRDRAGF r IGFLOTVHGGLYLRvn IMVLNQ t A IQVPR I LV 
l'F-flfrrAWDL[]NK5;AEENW:^GDVLAVGOTUJFrLCAFVLF)fr4LWFFV^KSVLRHSRRRRR 

'.'I'NjMH.! 4 307S2 4 100jfi 

y . 1 1 / .7 y t ■ 1 1 . * SAM - Dt ;[ k fruit 1 ! it M» :t hy r. r.inr, t f. r.is/ 

I J VTL/^L[iPNTt/;TRAVFrTLP:jVIGELVMRL[XJLIVEnDB^jGRAFLS[ J WK [PEVI1KFPLAI 
I ; KH AR [ . I 'K AWD F Y LE f ' t VKHCJ ENWG L I 3 DAC iLPC I ADPGAi: L VRR ARAU J t PVQAF.SC P 

r ti .a t w lj< ; l v:\Qr. ftfl* i y lpo: j pk ervk a r kk aat s k r v.its vi : t etc; y rnvytfe 

::LEJyrt l P:;YAI-:t,(:VA:n)L:;GPi;ELVLTROVOSWRTTSDLGCVKfJ.S[TKVPTrFLFHrPN 
CPn_0383 4l
CT047 hypothetical protein

VQDTTFLTLPMQKSLT3FDOF^^^YAEKyPAIALIG.'»ALEDDKDAL [ELLVGESFKELGG 
QGLMPATLMSWTETFALFQEHETLGIIffiEKFPLATKEFI^RYARNPQPHLTILIFTTKQ 
ECFRELSKALPSAL3LSLFGEWPADRCiRI IRLLLQRAERVC rSCSCSLASLFLRALAST 

slpdilsefdkllcsvgkktsldhsdXkelwkkekaslwkfrdsllkrdpvbghqqlhf 

LLEDGEDPLGIITFLRTCCLYGLRS DEEGSKENKHRMF/LYGKERLHQALN3LFYAETLI 
KNNVQDPlVAVETLVrRMVNL 

hctB-Histone-like Protein 2
VITCLI RGI KMIGAQKKOSCKKT^RAWKPAKK'/AAKRTVKKATVRKTAVKKPAVRKTA 
AKKWAKKTTAKRTVRJCTVAK}OPAVKKVAAJCRVVKKTVAKKTT AKKPVARK 
TTVAKG S PKKAAAC ALAC H KNJfl KHTS SC KR VC S ST ATRKHG S KS R VRT AHGWRHQL I KMM 
SR 

CPn_0385 434042 432522

pepA-Leucyl Aminopeptidase A 

FLVIKGEFVVlJ'HAQASGflNRVKADAIVLPFVWFKDAKNAASFEAEFEPSYLPALENFQG 
KTGE I EU.YSSPKAKEKRIVUXJI£KNEELTSDW 

I S ELRLS AEEFLVGLS SvJ I LSLNYDY PR YNKVDRNL ET P LS K VTV I G IV PKMADA I FRKE 

AAIFEGVYLTRDLVNRNADEITPKKLAEVALNLGKEFPS IDTKVLGKpAIAKEKMGLLLA 

VSKGSCVDPHFIWRYQGRPKSKDHTVLIGKGVTFDSGGLDLK^ 

VLG I LS ALA VLELP tfNVTG 1 1 P AT EN A I DG AS YKMG DVYVGMSGLSVEICSTDAEGRLIL 

ADAITYAIJCYCKPTyRIIDFATLTGAMWSI/jEEVAGFFSNNbvi^^ 

RLPLVKKYDKTLH^IADMKNLGS^^RAGAITAALFLQRFLEESSVAWAHLDIAGTAY>^EK 

EEDRYPKYASGFGVRS ILYYLENSLSK 

CPn_0386 434543 434046

ssb-SS DNA binding Protein 

KSKGYIJ^F^HF AGYLGAD PEERMTS KGKRVI TLRLGVKTR VGMKD ETVWCKCN IWHNR Y 
DKMLPYUOCaSGVIVAGDISVESYMSKDGSPQSSLVISVDSLKFSPFGRNEGSRSPSLED 
NHQQVG YESVSVGFEG EALDAEA I KDKDMY AGYGC EQG YVC EDVPF 

CPn_0387 435229 434699

CT043 hypothetical protein 

GDSLMSRQNAEENIJCNFAKELKLPDVAFDQNNTCILFVDGEFSLHLTYEEHSD 
PXYVY7PLIJX3LPDNTQRiCIJU,YEKIXEGSMLaKJ^^ I LMHCVLDMKY 

FAQLFIETVVKWRTVCADICAGR£PSVDTMPQMPCXXXX^ 

CPn_0388 435323 437320

glgX-Glycogen Hydrolase (debranching)
_KVSSYPSVPLPLGASKISPNRYRFALYASOATEVILALTDENSEVIEVPLYPDTHR 
^IWHIEIEGISDQSSYAFRVHGPKKHGMQYSFKEYLADPYAKNIHSPQSFGSRKKQGD 

EPFPWDGDQPI^LPKEEMIIYEMHVRSFTQSSSSRVHAPGTFLGIIEKIDHL 

]Hklc INAVELLPI FEFDETAHPFRNSKFPYLCNYWGYAPLNF FS PCRRY AY AS DPCAPS R 
FKTLVKTLHQEGIEVILDVVFrWGLOGTTCSLPWIDTPSYYILDACG 
f LNTNRAPTTQWILDILRYWVEEMHVDGFRFDLASWSRGPSGSPLOFAP 
ASTK 1 1 AEPWDAGGLYQVG YF PTLS PRWS EWNGPYRDNVKAF LNGDQNL IGT F ASR I SGS 
QDIYPHGSPTNSIhJYVSCHDGFTLCDTVTYNHKHTJEA 

ED PG I LEVRERQ LRNF FLTLMVSQG IPMIQSGD EY AHT AEGNNNRWALDSNANY FLWDQ L 
TAKPTLMHFLCDLIAFPJOrnO , LFNRGFLSNKEISWVDAMGNP>rrWRPGNFI^ 
s. AHVYVAFHVGAQDQLATLPKAS SNFLPYQ IVAESQQG FVPQNVAT PTVS LQPHTTL I AI S 



CPn_0389 438254 437319

CT041 hypothetical protein
' TVF^^FKRFYQKDSQRQ^Of^TCIJlPFKKTCKELIEFRRRTVKLLKNVLLGLFFSMS ISGF 
S EVKVSDT FVKQDTWEPK I RVLLSNESTTAL I EAKGPYR I YGDNVLLDTA IOGQRCWH 
ALYEG I RWGEFYPGLOCLK I EPVDDTASLFFNGIQYQGS LYVHRKDNHC IMVSNEVT I ED 
YLKSVLSIKYLEELDKEALSAC I ILERTALYEKLLARNPQNFWHVKAEEEGYAGFGVTKQ 
FTGVEEAIDOTARLVVDSPCGLIIDACGLLQSNV^ 
ESWNEELDGEIR 

CPn_0390 439171 438134 

ruvB-Holliday Junction Helicase 

RKSDREGSYMTHQVAVLHQDKKFDVSLRPKGLEEFYGQHHLKERLDLFLCAALQRGEVPG 
HCLFFG PPGLGKTSLAH IVAYTVGKGLVLASG PQL I KPSDLLGLLTSLQEGDVFF I DE I H 
RMGKVAEEYLYSAMEDFKVDITIDSGPGARSVRVDLAPFTLVGATTRSGMLSEPLRARFA 
FSAi^LSYYSDODLKEILVRSSHLLGIEADSSAIXEIAKRSRGTPRLANHLI*RWVRDFAQI 
REGNC INGDVAEKALAMLL I DDWGLNEI DI KLLTT 1 1 DYYQGGPVG I KTLSVAVGEDI KT 
LEDVYEPFL ILKGFI KKTPRGRMVTQLAYDHLKRHAKNLLSLGEGQ 

CPn_0391 439701 439510 

No robust homolog present in Genebank/EMBL as of 11/7/98

KDOLYKQEKPIPKATILSRNLEVMLDMPKGKRQTLFLGRTSGRSALYSYSRRILVLLNAF 

MRGP 

CPn_0392 439814 440383 

dcd-dCTP Deaminase 

MS IKEDKWI REMALNADMIHPFVNGQVNVNEETGEKLISYGLSSYGYDLRLSREFKVFTN 
VYNSWDPKCFTEDIFISITDDVCIVPPNSFALARSVEYFRIPRNVLTMCIGKSTYARCG 
IIVNVTPFEPEWEGHVTIEISNTTPLPAKIYANEGIAQVLFFESSTTCEVSYADRKGKYQ 
KQQGITVPCV 

CPn_0393 440229 440723 

CT038 hypothetical protein

KFLTLRHCORKFTLMKGLPRSYSLSL'/PPARFLMOTEKESIKSNKASPYLVSKVSVRKKN 
WGFRLLEEVMI K5WWV t FS I L [GGFVr*DRAIQELRTEELRLQ5KVoSLCQDIL5AQEKQR 
0LOLHLOHW0DSAA I EAAL IORLGLI PKGYKKt .OVSPKCOS ENKD 

CPn_0394 440727 4 4 H

C lyC - r :hll Dom.iin pror.t.- in IHf.-ri'ily:: in IfomoLntj) 

ketm r ptm[,mff i [CFTr/:r:GF uzL::^ [ alf,';i.i 'vcu I :;hykr::kl;kkoorvatlllhph 

HLL rTL IFCDICLN I A [QNCFA I LF^AA: ;WWI-TV< II .1T.A ITI.IUIET LPKAVALPFNTQ 

i a:.::jvapli [^:vtk ifkpllhwg [v; rrr^wowt u:k<.*j :d r lyi-OELKnvi^rKDFGV 

VNOEE^RLLYGYLSLSDC.WKE^<MOP^OUt^^Y[jr(V^^'[.ENLY[ J L^ , ::KOII^:::i^VPICNDN 
LUNLl/; iLTARriLLLHDKPLO.^DDLLI 'LLKK 1*7 YMI'l-:r r::AKMALCUM/V\EDE*n I IMI I 
DKY( J.'J I EGt . ITOEDLFE r VAGE I VL^HjNK | [ ,Y'IT: IAI jV f ri'LKLRKF: IK I I'D INt, 

KrNrNrJiATir^t.[EorGT[pr^;MKi.:;wNNijj\ivt,iJAAi'NRfRRVY i'rki,yd 
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Wn J1VV* AMW 44 3175 

CT257 hypothetical protein 

CNQfTNSALFV ICVN I IC rVLOCFYSMMEMACVSFNR^^W'LTKDHKKAfly INFLIRR 
PYR LFCTVMLCVN I ALQVGS E3SRNC YRALC I TPDYAPFTQ I F I W I FAELLPLT r SRK I 
PEKLALWAPELYYSHYIFTPLIOLIGSLTECLVYLLNIRKEKLNSTLSRDEFOKALETH 
H EEQDFNT [ATN I FSL3ATCADQVCQPLEQVTMLPSSANVKDFCRTIKNTDINF I PVYHK 
ARKNV IC I AHPKDFVNKALDEPL rNNLHS PWF ITAKSKLIRI LKEFRDNRSSVAWLNAS 

ge pig r lslna r fk r lfnttn r ahlkpkt isvi ertfpcnsr ikdlqkeldiqfpqypve 

CPn_0396 444J59 443241 

yhfO-NifS-related protein 

YSM I YLDNNAMT PPERGLLEFUJKTFL I EGTYANPSSVHOLGKKSRQLVLEASHWMOKVL 

sfcgrvlytsgateslnlaiaslpkdshvitsgsehpaileplkhsslsvsylnpeegrc 
vlti eq i eravt pktsai i lgwvnsetgakadiaai ahfaqerqlqfivdatanvgkeri 
vlpscvtmaafsghkfhalsgigallvsf^vki^ 

y i fkyldlhqer i sqe ilthrngfeka i karipdvhi hcadqprannvsaiafpplegev 
lq ialdiegvacgygsacssgatapfkslvsmgvdeeltiatuifsfshijxqedverav 
g 1 1 ekwerlkns 

CPn_0397 445124 444381

PP2C phosphatase family
EHFVDFDYFGLSDIGRVRARNEDFWQVNI>ISQWAIAIX3VGGRI^DIASQEAVTSU4^ 
IDEQQSKI^IGYGDDQYKETLKKIIXEVNCT/WEHG^ 

FHVGDSR I YRI REG ELRRLTEDHSLENQLKNRYGLPKQS DKVYSYRH I LTNVLGSRPYVM 
PDI RNL PCEKEDLYCLCSDGLTNMVPDI D I RD I LNQPATLEERGNALI SLANTRGGDDNA 
TWLVRIQ 

CPn_0398 445518 445700 

No robust homolog present in Genebank/EMBL as of 11/7/98 

I EELPMQI ENSS ILFAEWMKWFIFSVISAPWFLPGCTLI PKEKVTKVPSQLWSESLSQ 

P 

CPn_0399 445759 446523 

CT253 hypothetical protein 

YKLMRVLNG KS LNC ES I DLKS KNF PRAR IFCKI SNUm/TMRKMLVLLAS LGLLS PTLS S 
CTHLGSSGSYHPKLYTSGSKTKGVIAMLPVFHRPGKSLEPLPWNUXSEFTEEISKRFYAS 
EKVFLI KHNAS PQTVSQFYAPI ANRLPET I IEQFLPAEFIVATELLEQKTGKEAGVDSVT 
ASVRVRVFDIRHHKIALIYQEIIECSQPLTTLVNDYHRYGWNSKHFDSTPMGLMHSRLFR 
EWARVEG YVCANY S 

CPn_0400 446527 447306

CT254 hypothetical protein
SKE&SKFILIXSLGVAAIASKNFFIWPAPSGKTPUCI^ 

TAELLSTMTGI SLAFAFLFYLLFLPKD ITRAI LFSG ERPVKTSWRALGSAI RMWI 1 1 1 PV 
TQSrlGIMMSKFLTLVLPTQEIHTQEVTQEVQNSLPITGHYISMII^^ 
GII^FU<NKMrRIAAVl£SSIIFSFIHIEHSLGSWV^ 
SPfALHGLFNLTSLLFLGIK 

CPn_0401 447884 447495

CT255 hypothetical protein
MRDjj^SKLIGTVRAMWEGRCPWSLC^SLVSMVEHILG 

AQ.@^TLVLILCFLLEREGVU^EDVANEAMEKLRRRAPYIFAEDYKPVSIEEADRLWEL 
AKftREKNEST 


CPn_0402 449012 447888

mutY-Adenine Glycosylase
NRKRFCMTK IAFS EKAKNFPVEALKKWFEKNKRSLPWRDKPT PYSVWVSEVMLQQTRAEV 
V FDYFNQWMERF PT I E SLAAAK EEDV I KLWEGLGYYSRARHLLEGARMVMEEFHGK I PDD j 
AJS.LAQ I RGVGPYTVHAILAFAFKRRAAAVDGNVLRVLSRIFLIETS IDLESTRTWVSR I / 

aqaTlphks PEVIAEALIELGAC ICKKVPCCHRCPVRQACGAWRENKQFVLP 

I F|LfiRLVAI VLYDGSLWEKRRPKEMMAGLYEFPYI EVE PEEGLQDIEGFTKKMELSL 
• PL;EFLGNLKEQRHAFTNHKVHLCPI I FKATSLPQFGELHLLSDIDHLAFSSGHKKIKdZ 
LI£££GDVRSRESIGV 

CPn_0403 449009 449710

yceT-predicted pseudouridine synthetase family 
NF^LSNDKRAALQYFMENFSWLATQVSRLSS FLRSQLPNHS KQEI LAS I RQHRORVNGF 

i er fes ykvqpgdrvs ls l i pstkqq ps i lweddys 1 1 yekpphltteqmahmxrfftvh 
rldkgtsgcli^gkskqaatelmklfkqrkihkqyiafvfghpkkkfgtvksy/apvyrr 
cgavi fgaagpsqgep i ksaykwdcwvi lls emsttdlknslprssalssml'jt 

CPn_0404 450962 449871 

No robust homolog present in Genebank/EMBL as of 11/7/98
ELEALEQKYGKAVLLIALSELGIDTMSLLSGHRLEGFPPIAEVMAACDRQ5MDFCEILKS 
QSMDLWADAASCraLLODPFWSTAIASGIAKSSWETEFECESKVb^SWGECCAQVC 
S PFNLER I CMS F PS LKVFSLK KNGC ENMG I QLS ASCMNLLMS I FFVATMGGSTP IWITKE 
NLMALVALVLSHYtXYFVPATGDPQRGNII^NPEVNAIU^GMGMRVDtERKRGGESSSS 
RYLEIJ^CFENSLTKTSLLSDA^QERDKCLLQMSTSLMHTAGLNLQRPPVPTPSGVT 
.\HPOPOPDPWTSOPSLLCARERSPVSSRGRFPVVLPLSVISPRSH/GRVERRDLEDEEE 
EVMF 

CPn_0405 451814 450966

CT105 hypothetical protein

NrOTSHSRVLLKKFSKEFTrRTYRSLGFTDYLGGCLTNPLGKroSPQNPQVVTrAPSSTT 
POAVSSAVOGFLCr^GGAASSTATTTTASGASALCLSPrX3VOA/LTNLL^MX3PS^^PST 
SAGTSGASSSS ASMQQQLLQL I LDKTTCSCGSSVSSEQLQq/lSLVSQMTTSQGCSGGTC- 
ACQAASVLLNLLSATCSAAANPLGTAASLAQI rYAAVTSPdAKKTSEFCYNYCGETCQGN 
CGC PTCCC PDGQCCCGG PGR FFCGVWKNCCG IGEGSQEP/yf PL 

CPri_04 0h 4Sl'ifSn -1528*? 

fabI-Enoyl-Acyl-Carrier Protein Reductase

iXIF'MLK CDLTTJKVAFVAn IGDDQCYGWG r AKLLAEAGAT I IVGTWVP IYKIF30SWELGK 

™E:;RKt.r;NrrrL.t,EiAKiYPMDAr:Fn:;pF.DVPEDiAjENKRYKC [tci rr I sevaegvkkdf 

'^IDILVIirJIANIIPF.r'IKSLLLT™^^ 

Y I .A: JMRAVIWa * ',Ml ; ^ A K AAL E.'jDT KTLAWEAC r/ WG I RVNT [ S Ai^PLASRAfiKA TGF I 
EltMVDYY'jEWAP [ ! 1 EAMNAGOVf I A VAA F LA.1 P La/a t TG ETL YVDI ICANTVMCt [CPEMFPK 

Ul'> 

HAD superfamily hydrolase/phosphatase



NTYGDAMEKLLVTDr V 7T I Tj^^^M-DKKWER LYALHQA^K LFFLTGR YYK Y AAP LFSD 
FDAPYLLCCONCASVW.l.Tr^^WoKSLjP'DLU: I LC DCM EG AT A L F 3 V £SG A P YG DH Y 

YR FS PT P I AQDLHEYVDPR YfPWaK ER E? Lr ETRSLKDDY AF PS FAAAKVFG LRDEV I R I 
QKELERQEALTSVATHTLMRWPFDFR YA I LFLTDKSVS KCKALDR WN r LYDGKKPFVMA 
SGDDANDLDLIERGDFK I VMSSAPEE^flHVHADFLAPPADKNG ILGAWEAGVRYYDDLMSL 

CPn_0408 454090 454581

CT102 hypothetical protein

: ' p vu f m r n ; : *: .*. ■ ; .a : ./rr~ r c f. * r " '. 1 '. r ^ * .■ ■ m k r ,vy r " v? r~ : .7 : fg 
k< I'ywAKi-'-'!"/-. :;■':'(" :=■'■ \f", Ly.rr.::.:"\-r"".". v:, : nvvnk":" ,v :*:t:'"k ;t:,lt 

EFPPDTUiNKLUjLrJimO^^^^ATLlDKLJKAIv'K lvKKCv. v l"NClv 

CPn_0409 454645 455127

CT260 hypothetical protein 
MTTWTLNQNNLTKFLKSSDjtEPFLERESGL.^ 

LPYQLHESHKASTARLLHLOTRD 1 0 1 PGFGMDEEQGL r F YRLVLPCLNGE I HDTLLR I Y I 
DTIKLVCDSFSHAIGLIS£GNMT4LDELRROALQEQQEKRWE 

CPn_0410 455087 455333

dnaQ-DNA Pol III Epsilon Chain

DVRLFKSNKKNVMSSOTMDVLI FYDTETTGTQ I ERDRI I EIAAYNSVTDESFLTYVNPE I 
PI PDEASKIHG ITTMVLSAPKF PEAYEGFRKFCGEDS I LVAHNNDGFDFPLLGKECRRH 
SLEPLTNRT I DSLMWAQKYRPDL PKHNUJYLRQVYG FAENOAHRALDDWI LHKVFTSLI 
GDLPPQQV^DLLCCSYHPKVFKMPFGKYKGCPLVDIPKSYFEWLENOGALDKPENKDIKA 
AIALLHQPT 

CPn_0411 455794 456609

CT262 hypothetical protein

RHQSRYSS rTCTDMILTAAFSPCPND I FLFRSFLKDPQFRPLLNQVTI ADI ETLNTLALQ 
RRI^LWKMSAALFPLVSDYYNLMDVTJOTLGYNSGPIVL^ 

l^KLYYPKAKLIPMPYDKILSAILC^KVDGGALIHEERFSYDLQLTLRADFGELWRRKTI 
F PLPLGCIW I AKYVPMATVDALT AALRKS L ICSLKDP IT AG AKAVEY S KNKNVTV I HRF I 
GTYI^E^FQLSKTGKKALHMLWKANECCQYT 

CPn_0412 456515 457246

CT263 hypothetical protein

EPISTKKPFNYLKLGKKLY ICSGRPMNAVNT PKK I LC IVADYRE ISPLI EQLDFTQINEH 

LYSYRCTDYHLDLYIVH^STAVl^AIXJSYCOAYTDYDLWINPGFM^ 

TIE^IANLTTOTPPVLSEDPPYIFDALPDSLPKSSLVTSPVLYHYGFHKTFKUXMEGYA 

IA£OAAEHHIPCSFU<ITSDYTVTGDCPFSRLEEVSO?CLTOTLVTXLPEI^ 

LKP 

CPn_0413 459209 457227

msbA-Transport ATP Binding Protein
VFMKLLLXAVLRHKNHLVI LGC SLLAI LG LTFSSQME I FSLGM I AKTGPDAFLLPGRKES 
GKLVKVSELSQKDILENWAISKDSETLTVSDATTYIAEHGKSTASLTSKLSKFVRNYID 
VSRFRGLAI FLICVAI FKAVTLFFQRFLGQWAI RVSRDLRQDYFKALQOLPMTFFHDHD 
IGNLSNRVMTDS AS I ALAVNSLMINY IQAP ITFILTLGVCLS I SWKFS I LICVAFP I FIL 
PIWIARKIKNLAKRIQKSQDSFSSVLYDFI^CrvmV^ 

EEKSAAYGLLPRPLIJ4TIASLFFAFVVVIGIYKFAIPPEELIVFCGLLYLIYDPIKKFGD 
\ ENTS IMRGCAAAERFYEVLNH PDLHSQKERE I EFLG LSNT ITFENVSFGYQEDKHI LKNL 
/SFTLHKGEALG IVGPTGSGKTTLVKLLPRLYEVSQGKIL I DSLP ITEYNKGSLRNH I ACV 
f LQNPFLFYITrvWh^TCGKDMEEEAVLEAIJCRAYADEFIL^ 

QQQRLAI ARALLKNAS IL I LDEATS ALDA ISENYIKNI IGELKGCCTQ 1 1 1 AHKLTTLEH 

VDRVLYIEIIGQKIAEGTKEELLGTCPEFLKMWELSGTKEYNRVFVPDHKLVANPTDMAIT 

T 

CPn_0414 460203 459172 

accA-AcCoA Carboxylase/Transferase Alpha 

LCLRIVCIKMILFIRGEHILMELLPHEKOWEYEKAIAEFKEKNKKNSLLSSSEIOKLEK 
RLDKLKEKIYSDLTPWEBVQICRHPSRPRTVNYIEGMCEEFVELCGDRTFRDDPAVVGGF 
VKIC^RFVLIGQEKGCDTASRLHRNFGMLCPEGFRKALRI>GKIJVEKFGLPVVFLV^ 
AYPGLTAEERGOGWAIAKNLFELSRLATPVI IWIGEGCSGGALGMAVGDSVAMLEHSYY 
SV I S PEGCAS ILWKDPKKNS EAASMLKMHG ENLKQFG 1 1 DTV IKEP IGGAHH DPALVYSN 
VREF I IQEWLRLKDLAI EELLEKRYEKFRS IGLYETTS ESG PEA 

CPn_0415 461522 460221 

CT266 hypothetical protein 

SQTGFL PGLTL I FVI 1 1 VWCNAFL I KLCV I MG LQSRLQHC I EVSQN SNF DSQVKQF I YAC 
QDKTLRQSVLK I FRYH PLLK I H D I ARA VYLLMALE EG EDLG L S FLNVQQ Y PSGAVELFSC 
GG F PWKGL PYPAEHAE FGLLLLQI AEFY E ESQAYVS KMS H FQQALF DHQGSVF PS LWSQE 
NS RLLK EKTTLSO S FL FQ LGMQ IHPEYSLED P ALG FWMQ RTR S S S A FVAASGCQS S LG A Y 
SSGDVGVIAYGPCSGDISDCYYFGCCGIAKEFVCQKSHQTTEISFLTSTGKPHPRNTGFS 
YLRDSYVHLPIRCK ITISDKQYRVHAALAEATSAMTFS I FCKGKNCQWDGPRLRSCSLD 
SYKGPGNDIMI LGENDAIN I VSASPYMEI FALGGKEKFWMADFLIN I PYKEEGVML IFEK 
KVTSEKGRFFTKMN 

CPn_0416 461871 461557 

himD/ihfA-Integration Host Factor Alpha

EALSNMATMTKKKLI ST I SQDHKIHPtlHVRTV IONFLDK.VTDALVKGDRLEFRDFGVLQV 
VERK PK VG RN PKNAAV PIH t PARRAVK FT PGKRMKRL IETPNKHS 

CPn_0417 463047 462244 

amiA-N-Acetylmuramoyl Alanine Amidase

REKGMKLTKYLNTKQLRSM ISR LFVR YSLPMS KQLS FFALCVLGSH P I FACT PNPPQRVR 
R.«JEVI F IDPGHGGKDCCTASKELHYEEKSLTLSLALTV03YLKRMGYKP0LTRS3DVYVD 
LGKRVALSNRCQGDVF I S I HCNHSSNAAAFGTEVYFYNGK'/GSPTRMRMSEVLGKN I LAA 
MEKNG ILKSRGLKTANFWIRDTSMPAVLVETGFLSNSP.ERAALODARYRMHyAKG TAEG 
VH N FIT/: PS FQK PKON I AK I R K PQ I QAN 

OPriJl^lH 1 <li»44l)l 462^53 

murE-N-Acetylmuramoylalanylglutamyl DAP Ligase
MOI..KKI..I,MGVOAK I Yt : K VR P L E VRN LTR D.S RC VS VG D T F r AJ i KGQP. Y [X'.Ub V A VHA LANG 
A r A t A:;.*:LYNPKr.::WO [ t'ITNr.EELEAEL^AKYYEYPr;"KLHTI' , ;'/ , PfJTN* r ;KT'rVTCL r 
KAI.Uj::Y'JKT::< IT n-M [ U^K;VIKDGFTTPTPALLOKYLA , W/l<(JNRr)A , / , /MEV.':S [ 

' ;[ ,A: :< ^vaytnh rrAvi ,tn in .umi^dfmgtfetwaaka.klf^lvi-p:;! ;mw nrrosPYA 
::i.x: [ f.::akai'v itw ; r K::AAOYr'ATDr9L:^r;cTKYTLr/GD0K i A' f if ;kynv^nl 

I^At^M'VMA::! J<t:iM.K|)|,[M-:KU;i/;Of j rn;RLDPVLMGPCPWTOYAMUT[)AIJ>NV't:rGL 
l!KI.!.l'[-/y;i<LIWK. -a -x;p|(|j|^-:KHKU>lAOWERYGFAr/T:;DNt-[^^Knr'KI J .[VNr.I(:Dr; 
l-'Y: 1KNYF t F. 1 1 jKKOA I TYAf : I A-'IDI-^ I VL [ Af JKGHEAYO I FKHVI'VAE-' WiKVI'Vi *KVt.A 
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CPn_04l , .» 4f>4B7*> 
pbp3-transglycolase/transpeptidase
QLFPNTN r WN I PQKKVSVFY PMSYRK R3TL I VLCVF ALYALLVLRYYK IQ I C ECDHWAAE 
ALCOH E FCVR D P FR RGT F FANTTVRKCDK DLQQ P FAVDITKFHLCADPLA IPECHRDEII 
QG I LGF lEGQTYDDLnLKLDKKSRYCKLYPLLDVSVHDRLSLWWKGYATKHRLPTNALFF 
[TDYQRSYPFCKLLGQVLHTLREIKDEKTGKAFPTGGMEAYFT4HILEGDVGERKLLRSPL. 
NR LDTNRV I KLPKDGSD tYLT TNPVTOT I AEEELERGVLEAKAQGCRLI LMNSQTGEILA 

i.A>.iV!-KP-ni- , i'rr.*;- , i"('i- , rjrjK;:! j ri-:in , K7::rw::r//F-:ru:WMKri.': , VA :al.O'V:eea:;lk^ 
kk i Kr-i , EF.r[f«.Tr'T:.Kf ipk- :::i'fyL:::i- , n:: r .:.. , ;Mm\ t'VK: :: ;rr-/WAGE.ArR [ :o"E/"; 

VAWYQQKLLAI/3FGRKTGIELPSEASCLVPSPHRFHIhK3SLEWSl*STPY3LAMGYNILAT 
GIOWQAYAILJWGGYAVRPTLVKKIVSASGEEYHLPTKEKTRLFSEEITREVVRAMRFT 
TLPGGSGFRASPKHHSSAGKTGTTEKMIHGKYDKRRHIASFIGFTPVESSEGNFPPLVML 
VS I DOPEYGLRADGTKNYMGGRCAAPIFSRVADRTLLYLGILPDKKLRWCDEEAAALKRL 
YEEWNRSPKQGGTR 

CPn_0420 467120 466824 

CT271 hypothetical protein 

KSFPMNKSRFLRJXCCLCFCGSLFYFTIf^QNSLTKLRLEIPCLSVRLRQLEQQNrSLRF 
L I DKIERPDHLMEI AALPEYQYLEYPSEES ISLLSYELP 

CPn_0421 468007 467108 

yabC-PBP2B Family methyltransferase

EI LMSERAH I PVLVEECLALFAQRPPQTFRDVTLGAGGHAYAFLEAYPSLTCYDGSDRDL 
QAIAIAEKRLETFQDRVSFSHASFEDLANQPTPJU-YDGVLiADLGVSSMQLDTLSRGFSFQ 
GEKEELDMRMDGTQELSASDVLNSLJCEEELGRIFREYGEEPQWKSAAKAVVHFTIKHKKIL 
S IQDVKEALLGVF PHYRFHRK I H PLT L I FQALRVYVNGEDRQLKSLLTS A ISWLAPQGRL 
VIISFCSSEDRPVKWFFKEAEASGLGKVITKKVIQPTYQEVRiWPRSRSAKLRCFEKASQ 

CPn_0422 468233 468784 

CT273 hypothetical protein

GIJ^MVEIFTiYSTSIYEQHAS^^IVSDFRKEIQMEGISIRDVAKHAOILDMNPKPSALTS 
LU7TNQKSHWACFSPPNNFYKQRTSTPYIJtf^ 

RSQITPFLSYKDKEEEEDEDPEEDDDDPRVC^KVLUCAI^LGVKSTNVMIDYVISRIFQ 
FVQG 

CPn_0423 468788 469216 

CT274 hypothetical protein 

CMLDNTWKAILGWGDDELEE^ISG'/SFLRCCHYSKAILFFEALVILDPLSIYDHOTLGG 
LYLQ IGENSOALAVLDQALRMOGDHLPTLLNKTKALFCLGRI EEATAI ATYLSSCP I PAI 
ANDAEALLMS Y S KATKKNAALVR 

CPn_0424 469528 470961

dnaA-Replication Initiation Factor

SRCgEI FS PSLMGWVDC IWESF INKESGMLTCNECTTWEQFLNYVKTRCSKTAFENWISP 
IQVI^ETOEKIRLEVPNIFVQNYLLDNYKRDt^SFWlJJVHGEPALEFVVAEHKKPS 
ASj^SMEGISEVFEETKDFELKIJ^SYRFDNFIEGPSNQFVKSAAVGIAGKPGRSYNPL 
F I HffiVGLGKTHLLHAVGHYVREHHKNLRI HC I TTEAF INDLVYHLKSKSVDKMKNFYRS 
I^l^VDDIQFUJ^ONFEEEFCOTFETLim^SKQIVITSDKPPSQUO^ERIIARMEWG - 
LV^jfVG I PDLETRVAI LQHKAEQKGLL I PNEMAFYI ADH I YGNVRQLEGAINKLTAYCRL / 
FGJC^TETTVRETLKELFRSPTKQKISVETILKSVATVFQVKI^LKGNSRSKDLVLARQ 
IAm^KTLITDSLVAIGAAFGKTHSTVLYACKTIEHKLQNDETLKROVNLCKNHIVG 

CPn_0425 470965 471564

hypothetical proteins 
FRGCPMFRRTGKGPFEDVQTLYEEETSSPSSYSPYSRSERPETPPSLFDNPKASEARPLN 
HNLTEESSLPCWSSTPRTESIXPLEEPETTLGEGVTFKGELAFERLLRIDGTFEGILVSK 
GKt £IGPKGWKADIQLQEAI I EGWEGNITVSGKVELRGGAI IKGDIQANTLCVDEGVR k 
ILGYLAIAGITDHSERERDL 

CPn_0426 472111 471536

CT|^7 similarity 
MVtFSLLFPKLCYGCQAPGAYFCSNCLEKLLVEDREGRCLHCFRYLGSSETRLCSOC/pS 
SQISQAFSLYLPSOTALSVYARACEGKRPALOFFSKSIAFELASLDETPSCIAYITS'plSR 
KiyyEVAKLEKLLRI PLWPWLPKKRQ IEKLPKGEG ICFLSAYPLSQKWMQTI VGGSASPL 
VSESEFLSQNDQ 

CPn_0427 472153 473715

nqr2-NADH (Ubiquinone) Dehydrogenase 

avcyvferveastflsitmlkkfinslwklcoodkyqrftpivoaidtfcye/ietpskp 
pfirdsvdvkrwmmlwialfpatfvaiwnsglosivyssgnpvlmeoflhlsgfgsyls 

FVYKEI H I VP I LWEGLKI F I PLLT I S YWGGTCEVLFAWRGHKI AEGLLVt GILYPLTL 
PPTIPYWKAALGIAFGIWSKELFGGTGMNII^PALSGRAFLFFTFPAKMTCDVWVGSNP 
GV I KDSLMKMNS STGKVLI DGFSQSTCLQTLNSTPPSVKRLHVDAI AANMLh I PHVPTQD 
VIHSOFSLWTETHPGWVLDNLTLTQLOTFVTAPVAEGGLGLLPTQFDSAjirAITDVIYGIG 
KFSAGNLFWGNI IGSLCETSTFACLLGAI F LIVTGIASWRTMAAFGIG/fLTGWLFKFIS 
VL I VGONGAWAPARFF I PAYRQLFLGGLAFGLVFMATDPVSS PTMKLOKWI YGFF I GFMT 
IVIRLINPAYP EGVMLA I LLGNVF APL I DY FAVR KYRKRGV 

CPn_0428 473719 474681 

"nqr3 -NADH (Ubiquinone) Oxidoreductase, Gamma/ 
NMSKCSSKHTVR INQTWY I VSF I LGLSLFACVLLST I YYVLS P IjQEQAATFDRNKQKLLA 
AHILDFKGRFQIQEKKEWVPATFDKKTQLLEVATKKVSEVSYP^LELYAERFVRPLLTDA 
QGKVFSFEEKNLNPIEFFEKYQESPPCTOSPLPFWILEm'SWTENMSGADVAKDLSTVQ 
ALIFPI GGFCLWGP IHGYLGVKNDGDTVXGTAWYCOGET PCLJGAN ITNPEWQEQFYGKK I 
FLQDSSCTTNFATTDLGLEWKGSVRTTLGDS PKAL3A I DG/SGATLTCNGVTEAYVOS L 
ACY RQLLINFCNLTHEKKTGE ' 

i:Pn_fMJ'J 474b66 47531^ 

nqr4-NADH (Ubiquinone) Reductase 4

KRNPPJ^T^KKS'YKoYFFDrLWflNNO I LIAI LG IO"ALA(rPTTVQTA rTMcnAV3 IVTC,C<1 
::i-FV:;rxRKFTPDSVRM[TOLriI:;t.FV*IVIDOFLKA/FFDr::KTL,:lVFVGLr TTNrtVM 
< ;i<.^E:;LARHVTr' [rAFLDf^FAGGUJYiWLLVICVr 7ELFCFOTLM(:FR r TPOFVYA^ET 

t!pr/ -.YotiLi ;i Mvi.,\rr,A[- flu; im iwlvn [ rdskkp/Kr 

'.Tu.'j'liO '17S'i23 -I7t,0'i.l 

n<ir'> NADU f fit i it pi inorie) Rodu.:t.:i:;i> 

l-MWl /;AV*I-WI .MVI A ; 1 LUJAAF ION I LUXNFLo/^Yt^CriTRV.'VrAN^U^ML-VALVLTVT 

■•: i riwrviiAF it jpkantw [.;p^la:;vnu;fl/l i r f rw [ aaftq i i.elllekv:;rnly 

\»:\r,l f-'M'L t AVNi.'A [ U V iVLFG ITRSYPF r I7MM I V.IU IAC JOTWWI .A [ V [ LATTKEKLAY 

r.x> u-yuuj m w.w irrr :r. [amafm.';l'tv. [ [^:;kk;ak iokapletewenttnplke:w 



SKHCPSrSKARTORR.'IL 
CPn J>4 31 47 hW9 /1 7 6 1 S I 

No robust homolog present in Genebank/EMBL as of 11/7/98
K IMTTLFKWPRSRQNPDTLTFLKR/sSVLLHSENS LSYR I FAKVLA I LL73 LAVAFAVT 
LFSCEGSOLRLCALY tC I ALAICVTyLT I WYC I A^K [ ATACKK PPS r SR £ E IV 



CPn_0432 47S9.17 47h5U



\:MDrv\"I^A^N\-\v;«F/.!-:.'l-:!':r..:A.".. '■ :V::ii\ , fc-.-:.T!A!XV:..':.:Ar.~iV.VV' 
GLFLTGSAPL/jLJ IWI AA;J«-; :7LoMLVCACWKYK LJNALtKTKVAHES 

CPn_0433 47/3:7 476929 

gcsH-Glycine Cleavage System H Protein
RTFR I LYGTLYRTGSRKVMWYS DYHVW t LPVH ERWRLGLTEKMQKNLGAI LHVDLPSVG 
SLCKEGEVLVILESSKSAI EVLSPVSGEVIDINLDLVDNPQKINEAPEGEGWIAVVRLDQ 
.DWDPSNLSLMDEE 

CPn_0434 479471 477276

CT283 hypothetical protein
RPWVRI YQ0DLFCR1CRDPAWFFSLLSFTLRFYCLGRGWTLLSFFYKHQKKF IG IVIAW 
CVSGIGVGW3RFSRKGSAESTSRRTVFTTASGKRYVEKDFMAMKKFFAHEAYPFTCNPRA 
WNFINEGIXTDYFLTTRVGEKLFLKVYHPGEKIFSKEKAYQPYRRFDAPFISSEEVWKSS 
APQtXEIUCWOgiENPISKEGFLAJlAKLFLEERRFPHYVLROMLEYRROMFALPPDEAL 
SRGKDLRLFGYOTIQI>TCDAYL5AAVELLIRFIDEQKKVLPRPSK0EARDDFYDKAKKA 
YTKI SKNKEFSLGFEEFVNSYFQFLE I S ESEFFNMYRDI LLCKRALLLLCGGVS FDFQPL 
TTFFVQGKDS /qVEFFRLPKEYSFKTKQELKAFEVYLKLVSLPKSDSLDVPNEI LP I AT I 
KAKEPRLVGHRFS IDYKRVALQ DLAATVPMVEVLHWQQNS EH FQEI LQQ FPDVETCQSYK 
DFQHLKPAIjRDK I SLFTRKE ILRARPER I LQS LQQVPKQSQEVLLS AG KNS AL PG I SOGO 
QLAKVIXEMEVLDLYSODAETYYT 1 1 VNSS FEKEEVLPY REVLKRD LASQLLTSHGHLVD 
MERI-ESA^TRYPGEEGASLWORRI>nCVVmiRLGRHLEGSFSWSLDRSLKTFSRGDKEL 
POEFDR^SMKVGDYSSVFMSPNEGPCYYCCLSHLLYDRPASVDKLFLAKSQLDEEIXGS 
YMERFIE 

CPn_0435 480908 479475

Phospholipase D super family [uncleavable leader peptide] 

SRLRFRLAALGIFF I LLVPNSVS AKT I VASDKEKVGVLVYBNSVEAFQQ ILDC IDH 
ANE^YVEI^PCWrGGRTLKEMVDHLEARMDLVPELCSYI I IQPTFTDAEDQKLLKALKERH 

FFYVTTGCPPSTSILAPNVIEMHIKLSIIDGKYCILGGTNFE^ 
n/RLFVSGVRR P LAF RDQD I MIJIST A FGLQU^ EEYH KQ F AMWDYY AHHMWF I DNP EQ FAG 
t PPLTLEQAEETVFPG FDKHEDLVLVDSSK I R IVLGGPHDKOPNPVTQ EYLKLIQGARS 
YFIPKDE1XNALVDVSHNHGVHLSLITNGCHELSPAITGPYAWGNRINYFALL 
/YGKRYPLWKKWFCEKLKPYERVS IYEFAIWETQLHKKCMIIDDEIFVIGSYNFGKKSDAF 
DYESIWIESPEVAAKANKVFNKDIGLSIPVSHGDIFSWYFHSVHHTLGHLQLTYMPA 

CPn_0436 481633 480902 

lplA-Lipoate Protein Ligase-Like Protein

FYVCYMKVRIVDSGKSSAASHHAKDRDLLESLQDGELIL^^ 

FLI^NYADLGl^AAVRPTGGGFWHKGDYAFSVIJ^SATHPSYSSSVLENYHTW 

LEKVFR IQGMLAPEDENSS SRDSGNFCMAKTSKYDVLFGDKK IGGAAQRKVQQGFLHQG S 

LFLSGSSSEFYQRFIJ<PEVLEEIIEQIQIHAFFPLGLEAADEVLQEAJIC^VKEAFIKLFC 

GEGL 

CPn_0437 481810 484350 

clpC-ClpC Protease 
VFMFEKFTNRAKQVIKIJU<KEAQRI 1 NHW 

ARQEVERL IGYGPEIQVYGDPALTGRVKKSFESANEEAS LLEHNYVGTEHLLLG ILHQSD 
SVAWVLENLHIDPREVRKEILRELETFNLQLPPSSSSSSSSSRSNPSSSKSPLGHSLGS 
DKNEKLSALKAYGYDLTEMVTIESKI^PVIGRSSEVERLILILCRRRK^PVLIGEAGVGK 
TA IVEGLAQKI I LNEVPDALRKKRL ITLDLALMI AGTKYRGOFEER I KAVMDEVRKHGN I 
LLF I DELHT I VGAGAAEGA IOASNI LKPALARG EIQC IGATT I DEYRKH I EKDAALERRF 
QKIVVHPPSVDETIEILRGUOCKYEEHHNWITEEAUCAAATLSWYVHGRFLPDKAIDL 
LDEAGARVRVNTM3QPTDLMKLEAEI ENTKLAKEQAIGTQEYEKAAGLRDEEKKLRERLQ 
SMKQEWENHKEEHQVPVDEEAVAQWSLOTGIPSARLTEAESEKLLKLEDTLRRKVIGQN 
DAVTSICRAIRRSRTGIKDPNRPTGSFLFLGPTGVGKSLLAQQIAIEMFGGEDALIQVDM 
SEYMEK FAATKMMGS P PGYVGH EEGGHLTEQVRRRPYCWLF DE IEKAH PD IMDLMLQ I L 
EQGRLTDS FGRKVDFRHAI I IMTSNLGADLIRKSGEIGFGLKSHMDYKVIOEKIEHAMKK 
HLKPEF INRLDESVIFRPLEKESLSE I IHLEINKLDSRLKNYQMALNI PDSVr SFLVTKG 
HSPEMGARPLRRVIEQYLEDPLiAELLLJ<ESCRQEARKLRATLVENRVAFEREEEEQEAAL 
PSPHLES 

CPn_0438 485455 484334

ycbF-PP-loop superfamily ATPase 

NLTLPMP.P.OVREIMC^TVIVAMSGGVDSSWAYLFKKFTWKVIGLFMKNWEEDSEGGLC 
SSTKDY EDVERVC LQLD I PYYTVS FAK EYRERVFAR FLKEYS LGYTPNPDI LCNRE I KFD 
LI^KKV^EU3GDYLATGHYCRUrrEL0ETQLLRGCDPQKD0SYFLSGTPKSALHNVLFPL 
GEMNKTr/RAIAAOAALPTAEKKDSTGICFIGKRPFKEFLEKFLPNKTGNVIDWDTKEIV 
G0H0GAHYYTIGORRGLDLGCSEKPCYWGKNIEENS I Y IVRGEDHPQLYLRELTARELN 
WFTPPKSGCHCSAKVRYRSPDEACTrDYSSGDEVKVRFSOPVKAVTPGQTIAr/OGDTCL 
GSGVIDVPMIPSEG 

CPn_0439 485523 486077

No robust homolog present in Genebank/EMBL as of 11/7/98

1 1 SSNNP.7LFVSSTLNCVF PSS LPEESADLF ITNKE IVALGEKGNVFLTHS I PMHIAAIT 

r L V I VALAC I A 1 1 C LCCYS05 ILLIAVGIVLTI LTLLCLCALVG FIKFI ROL PQOLHTTV 

QF IREK rftPESSLOLVTNAORKTTODTLKLYEELCDLSOKEFKLQSTLYQKRFELSHKNE 
' KTNQN 

CPn_04 40 48^0^1 ■ AH^IAO 

No robust homolog present in Genebank/EMBL as of 11/7/98

LAT [R^r.TIMAT^VAPSPVPRS.'JPL.^HATEVLULPNAY ITQPUP [PAAPWETFRSKLSTKH 

ti j :f altllltu xrv 1 ■. :AOYAr;-nr;Nw iu :< ; r r\u ; rr vr.xr . 1 lallla t plknkqtgtkl 
tuia:;o.M::s ri;:»3FV0RYiiLMF:rr 1 k: :vmli*elttonoektr t lne ieakke:: tonlel 
k I teco: iflaokopkr k:j:.vk:;fmr: : r kh 1 ;knpv t r a- r>; 

f.I'[i_U4'',l 'XXtWi 4H7HJH 

CT007 hypothetical protein

p.wh i l.v:;.mfkli.kii t AAFA*;nv[.:rrp 1 k r voiwj ; f iJKKAf^nppi'KPi'.'iAovfjyr.KVND 

AKKKKLWriGYROYD JTFLCTL1'' ITKIIllr A,[,V\7V<Y{ 1<)AU lyWK.';:Xl' t:!KTDt J NCLC 

watrjlt::fynyvll^u :aytl:j Lnm.w: ; 1 r t 7 ;i .vni -Kf 1 r fm ivniw w/ ;k yqat 

KKdSA 1 F'j'V TNET* XHOF.KAW[*L'A;V:;YKA'rr>.jt,TIJIf * I Yf'VNF:} IH;YH:'T::Vt 'NIjt^LAY 
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nnnkt;;ktfkt.';affgt;:;avmmf I 




\FfW* IGNDI S I ADDH 



CPn_0442 4H79W 4RS523

CT006 hypothetical protein

NTLFOTDMCFKM [CKC^SQLYLfJGIFPERILARKLKNCAKSYPRTALTIEVLVSSVLGAL 
KV t L £ PCASTYAALTLPLRAX-FriAI KTK5CQHLA3YAMAWLLH I LTIAVI IGLVFSLVFI 
PPPWF rSLGL.L^3VTTrr/TLFQ*/HKNLFPPYEPPPSRPHTPPPFADEYVPLISESYFD 

•\ :i_ Jtll -1 -1 '<'''' i ' 

CT005 hypothetical protein

VDSMSQPPINPLGQPQVPAAA3 PSGQPSWKRLKTSSTGLFKRFITIPDKYPKMRYVYDT 
GIIALAAIAILSILLTASGNSLJ^YALAPALALGALGVTLLISDILDSPKAKKIGEAITA 
IWPIIVIAIAAGLIAGAFVASSGTMLVFANPMFVMGLITVGLYFT1SLNKLTLDYFRREH 
LLRMEKKTQETA£PILVTPSADDAKKIAV^KKKDLSASARMEEHEASORQDARHRRIGR£ 
AQGSFFYSSRNPEHRRSFGSLSRFKTKPSDAASTRPASISPPFKDDFQPYHFKDLRSSSF 
GSGASSAFTP IMPASSRSPNFSTGTVLHPEPVYPKGGKEPS I PRVSSSSRRS PRDRQDKQ 
QCOQNQDEE0KQ0SKKK3GKSNQSLKTPPPDGKSTANLSPSNPFSDGYDEREKRKHRKNK 

CPn_0444 490266 494507 

pmp_6-Polymorphic Outer Membrane Protein

KAFPCRHMKYSLPV^LTSSALVFSLHPLMAANTDLSSSDNYENGSSGSAAFTAKETSDAS 
GTTYTLTSDVS ITNVSAITPADK3CFTNTGGALSFVGADHSLVLCT I ALTHDGAAINNTN 
•TALSFSGFSSLLIDSAPATGTSGGKGAICVTNTEXX3TATFTDNASVTIJ0KOTSEKDGAAV 
SAYS I DLAXTTTAALLDQrrrSTK>GGAIXSTANTTVQGNSGTVTFS SNTATDKGGG IYSK 
EKDSTLDANTGVVTFKSOTAKTGGAWSSDDNLALTG 

GGA I CC YLATATDKTG LAI SQNQ EMS FTSNTTTANGGA I YATKCTLDGNTTLTFDQNTAT 
AGCGGAIYTETEDFSLKGSTGTVTFSTNTAKTCXIALYSKG^^ 

SNSSANQEGCGGAILAFIDSGSVSDKTGLSIANNQEVSLTSNAATVSGGAIYATKCTLTG 

NGS LT FDGNTAGTSGG AI YT ETEDFTLTG STGTVTFSTNTAKTGGALYS KGNNSLSGNTN 

LLFSGNKATGPSNSSANQEGCGGAILSFLESASVSTKKGLWIEDNEWSLSGNTATVSGG 

AIYATKCALHGNTTLTFDGNTAETAGGAIYTETEDFTLTGSTGTVTFST^ 

KGrTTSFTKNKALVFSGNSATATATTTTDQEGCGGAILCNISESDIATKSLTLTENESLSF 

INNTAKRSGGG I YAPKCVI SG S ES INFDGNTAETSGGAI YSKNLS ITANGPVS FTNNSGG 

KGGAIY IADSGELSLEAIDGDITFSGNRATEGTSTPNS IHLGAGAKITKLAAAPGHT IYF 

YDPITMEAPASGGTIEELVINPVVKAIVPPPQPKrGPIASVPWPVAPANPNTGTIVFSS 

GKLPSQDAS I PANTTT I LWQKINLAGGNVVtJCEGATUJVYSFTQQPDSTVFMD 

TTTNNTDGS IDLKNLSVNLDALDGKRMITIAVNSTSGGLJCISGDUCFHNNEGSFYDNPGL 

KANt^PFLDLSSTSGTWLDDFNPIPSSMMPDYGYO^SWTLVPKVGAG^ 

ALGYTPKPELRATLVPNSLWNAYVNIHSIQQEIATAMSDAPSHPGIWIGGIGNAFHQDKQ 

KE^GFRLISRGYIVGGSMTTPQEYTFAVAFSQLFGKSKDYVVSDIKSQVYAGSLCAOSS 

YVMLHSSLRRHVLSKVLPELPGETPLVLHGQVSYGR^H^ 

FAV^'GGSLPVDLJWRYLTSYSPYVKLQWSVNQKG FQEVAADPRIFDASHLVNVS I PMG 
LT^ESAKPPSAUiTt^YAVDAYRDHPHCLTSLTNGTSWSTFATNLSRQAFFAEASGH 
LKtLHGLDCFA5GSCELRSSSRSYNANCGTRYSF 

CPn_0445 494739 497579

pmp_7-Polymorphic Outer Membrane Protein

FNJfLVSKKCLQMKSSVSWLFFSS I PLFSS LSI VAAEVTLDSSNNSYDGSNGTTFTVFSTT 
DAMGTTYS LLS DVS FQNAGALG I PLASGC FLEAGG DLT FQGNQHALKFAF I NAGS SAGT p 
VAi^AADKNLLFTJDFSPXSIISCPSLLLSPTGO£AIJCS^ 
NGCWJNTWIFLLSGTSQFASFSRIIOAFTGK^^ 

GGftLYSTDNCS I TDNFQ V I F DG NS AWEAAQAQGG A ICCTTTDKTVT LTGNKNLS FTNNT A \ 
LTY^AISGUCVSISAGGPTLFOSNISGSSAGO^GGGAINIASAGELALSATSGDITFNN * 
NQVJTNG3TSTRNAINI IDTAKVTSIRAATGOS IYFYDPITNPGTAASTDTLNLNLADANS 
E I EYGGA IVFSGEXLS PTEKAI AANVTST IRQPAVIJ^GDLVLRDGVTVTFKDLTQSPGS 
RILMIXXTTTLSAKEA^SLNGLAVNLSSLDGTNKAALKTEAADKNI 

FYE^LKSASTYPLiELTTAGANGTITLCALSTLTLOEPETHYGYQGNWQLSWANATSS ~j 
KIG S I NWTRTGY I PS P ERKSNL PLNS LWGNF I D IRS INQL I ETKSSGEP FERELWLSG I A / 
NF|^DSMPTRHGFRHISGGYALGITATTPAEDQLTFAFCQLFARDRNHITGKNHGDTYG / 

asLYfhhteglfdianflwgkatrapwvlseisqiiplsfdakfsylhtdnhmktyytdn/ 
S I ikgswrndafcadlgaslpfvi svpyllkevepfvkvqyi yahqqdfyerhaegrafm 
kselin^ipigvtferdsksekgtydltlwildayrjwpkcqtsliasdanwmayg™ 
iar@sfsvraanhfqvnphmeifgqfafevrsssrnyntnlgskfcf / 

CPn_0446 497602 500415

pmp_8-Polymorphic Outer Membrane Protein
LI LSMK I PLHKLLI S STLVTPI LLS I ATYGADASLS PTDSFDGAGGSTFTPKS TAD 
ANGTr>TYVi_.SGNVY I ND AG KGT ALTGCC FT ETTGDLT FTGKGY S FS FNTVDAG SNA£?AAAS 
TTADKALTFTGFSNLSFIAAPGTTVASGKSTLSSAGAI^LTD^ILFSQNVSNEAi^ING 
GAITTKTLSISG^^ , SSITFTSNSAKKLGGAIYSSAAASISG^^X3QLVFMNNKGE^jGGAL 
GFEASSSITQNSSLFFSGNTATDAAGKGGAIYCEKTGETPTLTISGNKSLTFAENSSVTQ 
GGAICA^GLDLSAAGPTLFSNNRCGNTAAGKGGAIAIADSGSLSLSANO^DITFLGNTLT 
STSAPTSTRNA I YLGSSAK ITNLRAAQGQS I YFYDP IASNTTGASDVLT INQ pZ»SNS PLD 
YSGT I VFSGEKLSADEAKAADNFTS I LKQPLALASGTLALKGNVELDVNGFTfifrEGSTLL 
MQPGTKLKADTEA I SLTKLWDLSALEGNKSVS I ETAGANKT ITLTSPLVFQDSSGNFY E 
S HT I NQAFTQ PLWFTAATAAS D I Y I DALLTS PVQT PEP HYG YOG HWEATWXot STAKSG 

tmw/ttgynpnperrapwpdslwasftdirti^imtsoansiyo^rgiAasgtanff 

HKDKSCTNOAFRHKSYGYIVGGSAEDFSENIFSVAFCQLFGKDKDLFIVEWTSHNYLASL 
YLQHRAFLOGLPMPSFGS TTDMLKDI PLILNA0LSY3YTWJDMDTRYTSWEAQGSWTNN 

sgalel^slalylpkeapffogyfpflkfoavysrqqnfkesgaeara/ddgdlvncsi 
pvg i r l ek i s ed eknn fei5layig dvy r knprs rt slmvsg aswts lqknlarqa flas 

AGSHLTLSPHVELSCEAAYELRGSAHIYNVDCGLRY3F / 

CPn_0447 500541 503351

pmp_9-Polymorphic Outer Membrane Protein
f^KPP[ALYMKS3LilWFLrS3SL^LPL3LNFSAFAAVVEINLGPTMSFSGPGTYTPPAGT 
TN AOGT T YNLTG DV.'i I TN AGS PTALT ASC FKETTGNLS FOGHG YQTLLQN r DAGANCT FT 

NTAAf ikll:;fl;gf:;yl::l [ottnattctga IKSTCACS IQCMYSofr fgqnfsndnggalq 
• r ; :l:;i ..npnltk arnkatokgoaly-tty x: it iwrrLwaASFsiENTAANNGGArYTEAS 

:;K \r/,:\ JKA I ::F r NNSVTATSAT^A lYCC^APKPVLTL.-DNC/ELNFIGrJTArT^GGAI 

mjNL7i.:::a xipti.f knnha [dtaaploga ta [ad:;g:;L3L3aLggd ltfegntwkcas 

"•'VITPI'H:; rN^NTNAK rvOLRASQ/iNT IYFYDP ITTH tTAfUJDAI jgLNGPDLAGNPA 
f< J T r/F'.'A ;EKL: M-VVF.AAKADNLK^T [ijOPLTI jVyJOL-LKr/JVTLVAKSF^Q.SPG.^TLL 

Mi>Afrm.*TAi>;mNNiA/iJWP:XKKTKK^^^ 
v:;wMiiv/i^rLT[/rAiuvANiHiTDUU^ 

•n/rwTK'rJYNF^I'KPRt rrt.VA^tW;:;FVDVn:;[COLyAT/vRn::oKTRGIWCBGtSNFF 

HKo:;TKmK<;n<iij::Ai !yvv("attti.a:jdnl n'AAFCOLfAJKDHUHF t nkmragayaasl 
m..ym.ATiH :iw::i,i ,hyu i :::k::eofvle-*oao tr.y lY^KN/f-MK'rYY'ivAPKGE^^-wYNDfJC 
At .k!,a:;::i ,i i itai .::hi* "i.i-iiAYFrF [kvka::y titoo^F/Epfrrrr.vRSFD.WDLtNVSVp 
if ; i TF'Ki-i': :umkra: ;yi:a'I*vi yvai vyrkniih -nvdiift irrr:;wKrnrrNL3R0AG igra 



G I FYAFSPNLEVT": fL5ME^^^B.^YNADLt'<GK FCF 

CPn_0448 5oWT 5036^8

yxjG_Bs_2 Hypothetical Protein

F I0PSRRE I HEWKC ILLC3SLRMEMMSPF00PE0CH FDWC3 FLRPE3LTRARSDFEEGR 
rVYEOMRWEDAAIP^JL IKKOTEACL I FFTDGEFRRYSWDFDFMWGFHGVDRRRDSNDPE 
rG VYLKDK I SVS KH P F I HW FE FVKT F EKGN AKAKCT I PS PSQFFHEMI FAPNLKNTRKFY 
PTNOEL I DD IVFT/ROVjIODLYAAGCRNLOLDDCAWCRLLD t RAPGW^GVDS HDRLQEI L 

:v rum . w:.- r-.-yi.r.-r.HV'-F'TPY.T^'F'-. -f «■ \ v. ■ : ••'~~r,rnir>r.<.\ 

CG F ASC EG DH RMTEEaOWK K lAFVKEXAKtl i'.vG 

CPn_0449 507231 505330

pmp_10-PMP_10 (Frame-shift with 0451)

EAYTCFRGGGGISF3^IVO^TTAGNGGAISILAAGECSL5AEAGDITFNGNAIVATTP0 
TTKRNS I DIGSTMC ITNLRA I SGHS I FFYDP ITAOTAADSTDTLNLNKADAGNSTDYSGS 
IVFSGEKI^EDEWT/AD^TSTLKQPVTLTAGNLVLKRGVTLOTKGFT^AGSS^ 
TTLKASTEEVTIrTGLS I PVDS LG EGKKW I AAS AAS KNV ALSG P I LLLDNQGNAYENHD L 
GKTQDFS FVQL6ALGTATTTDVPAVPTVAT PTHYGYO^TWGM 

WrrrCTfLPNPmOGPL^SL^SFSDIOAIO^IERSALTLCSORGFWAAGVANFU)^ 
KKGEKRKYRHKSGG"f A I GGAAQTC S ENL I S FAFCQLFGS DKDFLVAKNHTDTYAGAFY IQ 
HITECSGFITCIiDKLPGSWSHKPLVLEGOLAYSHVSNDLKTKYTAYPE^GSWGNNAFN 
MMLGASSHS^PEYUiCFDTYAPYIKLNLTYIRCDSFSEKGTEGRSFDDSNLFNLSLPIGV 
KFEKFSDCKDFSYDLTLSYVPDLIRNDPKCTTALVISGASWETYANNl^QALQVRAGS^ 
YAFSPMFm/LGOP/FEVRGSSRIYNVDLGGKFQF 

CPn_0450 508121 507180

pmp_l OA Polymorphic Outer Membrane Protein 

SG FMKSQFSV^VI^STIACFTSCSWFAATAENIGPSDSFDGSTNTGTY^ 

T LTG Da TLQNLGDSAALT KGC F S DTT ES LS FAG KGY S LS F LN I KS S AEG AAL SVTTDKNL 

SLTG^SLTFXAAPSSVITTPSGKGAVKCGGDLTFDNWTILFKODYCEENGGAISTKNL 

SLKNBTCSISFEGNKSSATGKKGGAICATGTA/DITNNTAPTLFSNNIAEAAGG 

CT ITGNTSLVFSENSVTATAGNGGALSGDADVT I SGNQS VTFSGNQAVANGGAI YAKKLT 

LASjGGGGVSPFLT I 

CPn_0451 508158 511058

pmp_10-PMP_10 (Frame-shift with 0451)

mtqrvk i ki ldscfvi fnl i ylfcfy idanss lknks itmkts i pwvlvssvlafschlo 

Alaneei^pddsfngnidsgtftpktsattysltgdvffyepgkgtplsdsc 
/lt flgnghs ltfgf I dagthag AAASTTANKNLT FSG FS LLS FDS S PSTTVTTGQGTLS s 
/ AGGVNL EN I RKL WAGNF STADGGA I KGAS FL LTGTSG DALFSNNS S ST KGGAI ATT AG A 
/ RIANhnGYVRFLSlJIASTSGGAIDDEGTS ILSNNKFLYFEGNAAKTTGGAICNTF^^ 
I ELI ISN^LIFASNVAETSGGAIHAKKLALSSGGFTEFLRNNVSSATPKGGAISIDASG 

ELSLSAETGNIT FVRNTLTTTGSTDT PKRNA I N IGSNGK FTELRAAKNHT I F FYDP ITSE 

GTSSD\TCJ<ItWSAGAUJPYQGTIIJSGErrLTADELKVA^ 

KGVTLESTSFSQEAGSLLGMDSGTTLSTTAGS IT ITNLG INVDSLGLKQPVSLTAXGASN 

KVrVSGKI^IDIEGNIYESHMFSHDQLJ^SLIJ<ITVDAI)VI7n^ 

YGFO^WNVNWirDTATNTKEATATWTKTGFVP 

E IGATGMEHKCGFWVSSMTNFLHKTGDENRKG FRHTSGGYV IGGSAHT PKDDLFTFAFCH 
LFAPXIKDCFIAH^SRTYGGTLFFKHSHTLQPQWLRLGRAKFSESAIEKFPREIPLALD 
VQVS FS HSDNRMETHYTSL P ES EG SWSNEC I AGG IG LDL PFVLSNPH PLFKTF I PQMKVE 
MVYVSQNSFFESSSDGRGFS IGRLLNLSI PVGAKFVQGDIGDSYTYDLSGFFVSDVYRNN 
POSTAT LVM S PDSWK I RGGNLS RQAF LLRG SNNYVYNSNC ELFGHY AMELRG S S RNYNVD 
VGTKLRF 

CPn_0452 511304 512860 

pmp_12-Polymorphic Outer Membrane Protein (truncated)
FNEETIWTILRNFLTCSAIJLALPAAAQVVYIJ1ESDGYNGAINNKSLEPKITCY 
FLDDVR I SNVKHDQEDAGVF INRSGNLFFMGNRCNFTFHNLMT EGFGAA I SNRVGDTTLT 
LSNFSYLAFTS APLL PQGQGA I YS LG SVM I ENS EEVT FCGNY S SWSGAA I YT P Y LLGSKA 
SRPSVNLSGN^YLVTRDWSCGYGGAISTHNLTLTTRGPSCFEMWAYHDVNSNMAIAI 
APGG S I S I SVKSGDL I FKGNTASQDGNT I HNS I H LQSGAQFKNLRAVSESGVYFYDP I SH 
SESHKITDLVINAFEGKETYEGTISFSGLCU3DHEVCAENLTSTII^DVTLAGGTLSLSD 
GVTLQLHSFKQEASSTLTMSPGTTLLCSGDARVQNLH I L I EDTDNFVPVR I RAEDKDALV 
SLEKLKVAFEAYWSVYDFPQFKEAFTIPLLELLGPSFDSLLWEnTI^RTQVTTENDAVR 
GFWSLSWEEYPPSLDKDRRI TPTKKTVFLTWNPE ITST P 

CPn_0453 513156 516152

pmp_13-Polymorphic Outer Membrane Protein

NCVLLYLFFYSLSLICRI IWFHLYVQMKTS IRKFLISTTLAPCFASTAFTVEVIMPSENF 
DGSSGK I FPYTTLSDPRGTLC I FSGDLY I ANLDNAI SRTSS SCFSNRAG ALQ I LGKGGVF 
SFLNIRSSADGAAISSVITCNPELCPLSFSGFSOMIFDNCESLTSDTSASNVIPHASAIY 
ATT PMLFTNNDS I LFQYNRS AG FGAA I RGTS I T I ENTKK S LLFNGNGS I SNGGALTGSAA 
INLINNSAPVIFST^ATGIYGGAIYLTGGSMLTSGNLSGVLFVNNSSRSGGAIYANGNVT 
FSNNSDLTFQhJNTASPQNSLPAPTPPPTPPAVTPLLGYGGAIFCTPPATPPPTGVSLTIS 
GENSVTFLENI ASECGGALYGKK IS IDSNKST I FLGNTAGKGGAIA I PESGELSLSANOG 
DILFNKNLSITSGTFTRNSIHFGKDAKFATLGATQGYTLYFYDP rTSDDLSAASAAATW 
VNPKASADGAYSGTIVFSGETLTATEAATPAfJATSTLNOKLELEGGTLALRNGATLNVHN 
FTQDEKSW IMDAG7TLATTNGANNTDGAITLNKLV INLDSLDGTKAA\'VNVOSTNGALT 
I SGT LG LVKNSQIV JDNHGMFNKDLQQVP I LELKATSNTnTTDFSLGTNC YQQS P YGYQ 
GTWEFT I DTTTHnTGNWKKTGYLPH PERLAPL I PNSLWANV I DLRAVSQASAADG EDVP 
GKOLS I TG I TNFFHANHTG DARS YRHMGGG YL I NTYTR IT PDAALS LGFGOL FTKS KDYL 
VGHGHSNVYFATn'^NITKSLFGSSRFFSGGTSRVTYSRSNEKVKTS^TKLPKGRCSWSN 
NCWLGELEGNLPITLSSRILNLKQI I PFVKAr/AYATHGGIQENTPEGRIFCHGHLLNVA 
VPVGVRFGKNSHNKPDFYTr tVAYAPDVYRHNPE3CDTTLPINGATWTS TGNNLTRSTLLV 
OASGHTSVNDVLEIFGHCGCDIRRTSROYTLDIGSKLRF 

CPn_0454 516179 519115

pmp_14-Polymorphic Outer Membrane Protein

GMPL3FK:;r:r;FCLL.^L(;3A^CAFAETRLr/:iir/PPITriOGEEILLT:'DFVC::NFLGA3F 
n. r ;.^F INri.'jriNLnLLwKGLSLTFTSCOAPTN^MYALLSAAETLTFKNFL'S INF'R'NQ.^TGL 
'X^L TYGKD IVFQS IKDLI FTTNftVAYnpA^'/TTSATPAITTVTTGA^ALOITDSI/rVEril 
HQ- I Y FFGNI JVJFCo A I .'J^S FTAWK F I NNTATMSFSI INFTSSGGGV t \<XV.Y:>\ .LFEf INS 
fX I [FTAN:;CW:': KG^/Tr^^^ITYAU^SGGAIGIPTGTFELKNNOGKtTF.SYNirrPNDAG 
A I YAET'.-N r Vf JMCV-ALLLD:WrAARNf XIA ICAKVLN IO/JRCP I EFSRNRAKKO^A I F IGT 

::v;rjr'AKO-P.*.TLT: :.-\:;rgpi AFy;NMt.NTKKJ irnait/eacxje rvsLriAw v ; : :hi;/fy 

DklTH.ir.FIT^I'SNKD IT INAN* yv^iSWFTr.K'JLSETELLLPAffHT I LU "IVK t A.'jGE 
I.K[TWIAWrryi^TAT^;:V.O[/IM/;:;arrL/;LATPTGAPAAVDFTir,KLAFDPF::KLKPD 
KV.'JA'JVfJAt JTKNVTCT' JAIA-I.OKHhVTIJI.YW/nLQ.'JI'VArPt AVFK('A IVrK'lV.FCryX 

rATr<;ir^;YC^;KW::Ymsnri.ijpAPu^;Ff'/;p3P3AriTLYAVWN:UM , LVR:;TYr[.hPP. 
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yr/ lYOAAL^Mtr^TDHTTUJUJFf^LYGKTNANPYDSR^^fcfLLoFFGQFP IVTQKSEA 
L I 'TWKAAYOY^KNHLNTTYLR PDKAPK30CQWHNNSY^^B[EHPFLNWCLLTRPLAQA 

WDL- T. F 1 3 A t: V LGGWQSK FTETGDLQR3 F3RG KCYNVSl^CS SQWFT PFKKAPSTLT I 
KLAYKPD I YRVNPHN [ VTWSNQESTS I SGANLRRHGLFVQ I HDWDLTEOTQAFLNYTF 
DGKNGFTNMRV.'JTGLKnTF 

<; t >n 0455 520)63 519458 

No robust homolog present in Genebank/EMBL as of 11/7/98

>■■■; .■■*•>■ it!/.' :v^i;Kr:Kr:T.v/-rii!::Rr'riri,'Lrrr.wr:r;L:!:Er < "nEGYriYL^/ 
i.. i ;! --Vf • J,;^^L,R^!^•:rT;^^'r::l■:I.[•R^l^^p^RNr.,o^A:;Rpr!•7;:p.EKEAADAY 

ASCCKWAFDDEHLPWV2SH IAYAEEI REKQEQTMOGSLTEEQIGALLCNTVSTEKNLAf 
ALDAV I KQS VWR FRN PDLF AYEREALEAS VTDALVS YVSNLDM I PYTSSQGI V I EDS S I V 
RTSOEHTLIVNCAAFDKLASQIEFLCPSDVLPISGKDPLISDDEDEELNPKVSSAADSKD 

KT 

CPn_0456 521568 520327 

No robust homolog present in Genebank/EMBL as of 11/7/98 

IPCTFESKRKFUITHCLHGWFSVVRHHFVQAFNFSRPLYSRITHFALGVIKAIPrVGHLV 

MGVWLISHCFERGVSHPGFPSDIAPILKVEKIAGRDHISRIENQLKSLRJCTIEVEDLDK 

VHG0YQENPYAD^SEVLKLDKGVHVSELGKA5SRVRNRITRSYSYAPTPQLDSIAIVG 

I DLVS PEEQENLVRLANEV IQLYPKSKTTLYLLI DFNKEWVGDI SSDKEKQLRSLGLHSE 

VCCL3VLEPC£AEGEDTKHFDLMVGCYGKDSYLREGKILC£AIJ3TSL^ 

SRYRSRLSLPINTEKDKTELYKEISRTHHQLHTLGMGLGAQDSGLLLDRQRLHAPLSQGS 

HCHSYLADLTHEELKILLFSAFVDAKNISKKELREVSLNFANDTSVECGCAFYF 

CPn_0457 523886 522120 

No robust homolog present in Genebank/EMBL as of 11/7/98 

VFLPSRVMASCLSAWFSIVREHFYRAFDFSLPFCARITEFVLGVIKGIPVVGHIIVGIEW 

LVSRYLESn/TKPTFVSDWSLLKTEKVAGRDHIARVVETLKRQRVAVAPEDEDKVHGKI 

PVH PFGG IQPVEVLTLYP EVQDATLGLAF SK I RNRVRQAYLQAPRPKLQKIY I IGNDMNP 

FEVDDFLHLARLCNETQRLYPDAT I SLYLTASGGRNAMDKKNRKLLSDCELNPKIACLDF 

NQGDVVKQATCDCWMVYHGENDC^LNQIQEELEKSGEETPWIHVGQKPL5QStA i roFSPF 

SSLEMKGDKEKALEYSELEKEQLYSRLVYVGERSSVLSLGFGDSRSGIIilDPKRVHAPLS 

EGHYCHSYI^LENPGLQmLAAFIJ*PKEI£STILQPISLNLI^ 

MSRSDRNVVVWCDSVMTTIMCEEPSFQHFIMELECRGYSHFTJIFAFRSNSMCVEERRIL 

NESSOEKAFTMIFCEDSVSGGDIRCLHIASEGMLCGKECYAVDVYTSGCANFMMEEVLTL 

ERESNLW^KHGLWKREVRKQKQELAALDQDESEIYVCNQLTAQQNFACS 

CPn_0458 526344 524236 

No robust homolog present in Genebank/EMBL as of 11/7/98 
YFKCYLKLFVSNFIFFVVMPIPYISSWISTVRQHFVKAFDFSRPFCSRVTNFALGVIKA 
I P#GH IVMGMEWLVSSCVAGI ITRSSFTSDWQIVKTEKALGRDHISRVAEILQRERGT 
ITfENQDKVHGKFPVCPFGRIJCSEETUaJCPGEttE^ 

I RTI S I VG S KLKT PQ D FSQ FVS LANETQRLH P EALVCL YLTG LNR£ SQMC DTTTAEKKQ Y 
LH8SGLDSR I QC KDS KEDD AGS PENP ELW IGYYSREQQHN I DGQY IQQC LGKSADP I PWI 
HVTEETC'KDFYYPPNFTSYSHTRQSTDPTSPPRLPESEGDKDS 
LKEEDAGI*IJ©PDRIYAPLSO£HYCHSYIJU^IENEDL^^ 

F>£iARL PLELDS LFFRLVAGQQ EGRN I VTLAHGT PRPEDLDPDSMN I LTRRLQMSGYSYL 
N 1^ JyXSRKM I VKERQFFGDRS EGKS FTL I LFEDP I S AADFRCLQLAAEGMVAKDLPSV A 
DICASGCSCIQFSEMQSPQAIEYRCWEARVEDEAGEEAREPVIYSQDQLSSMLTTCXJNFV 
FSraAWKQAIWRFRSKGLLTMERKALGEEFLTAIFSYLGSQERNE^ 
SF^LDRMVQVLPAEVPADSGND PTRPVPNPDSNPDS SQNEGS 

CPn_0459 527062 526619

No robust homolog present in Genebank/EMBL as of 11/7/98
STrKIQMHrcLRNVTOTSTNKtJlEEGSVSFREYFRAYMCDKIVAQKNFLFTLDAVIKQAGWR 
SQEKLNLFYVESQALGR E I KVSLEEY IQSMVG I LGSQRTKKS FKFSVDFTPLEQALQERC 
S^SDDEDATATSTATGATASPTDMH EDE 

CPn_0460 527840 526992

No robust homolog present in Genebank/EMBL as of 11/7/98
VIi2^LNFALEETPSISV0YQEQEKLSPCDHSPEIGKKKRW^IKLESFSTYCSLFMSVKprt 
YKLJNIiGIQNSLSGWLLDPYRVCAPLSSPYSCPSYIXDtiQ^EIJyiSLLSTFLDPKNLTafe 

tf^sinfgnssfgorwseflsrvlhdekekhvawc^aklleeglspealslleeol 
res^syi^ilsvspegvskvqerqiuwdi^rsftvmitdlpi^sedirswlasbri 
lvsWsldaadacasgckvlvyenpnaswaqelenfykqverrr 

CPn_0461 528647 527844

No robust homolog present in Genebank/EMBL as of 11/7/98
IS IVACPS I SSWFTWRQHFVNAFDFTHPVCSRITNFALG I IKAI FVLGHIVMCTIEWLI s 

wiprhtvrhgmftsdvssaikveotrghnclapleaylssuivpisqedlgkvhgrtped 
pfvditpteivqllpdeelstvdealcgvksrltyayrsvekpmiqdlalvg/glrdsad 
l infwlj^gvonhyphtkvklylaiwiadvwdcei seeekgqlralgldpjf ies i slts 
agl psv p evatvdfm i tc ygkdqevqdp 

CPn_0462 531124 529037 

No robust homolog present in Genebank/EMBL as of 11/7/98
L I FYLFLNLY IACVRFHFQCWFDPMACY ISIWI STVKQH F I RAFDFTRPLGSRITNFALG 
VIKAIPI LGCW IGVSWLVSTCSARRFGKPAFTSDVAS I VK I EKTRG/NPLAWVEQYLRQ 
LR VRLP EG DLG K I HG KVS R DYVCDRT PQENLNMVP HQY LGELGRAF YG I RNR VTKAYQRV 
T P L EV PC LT LVG FD I L DP EDO VN FVRLANG I QTQ Y PQTQ I KL Y L I SaQK I WNQCDGT I SQ 
EKEOOLRSLGLDAKIKCVSAPALLLQKYLQSENLPSCDLLINYYGKQQSVRDVDSIKSLL 
NL33EHIPAISVTYRPDDPFYSYYFFPGSCGGTAPDQRIPWSEQPHLQTYTTLSNPRCDR 
YAVHLGMEDFAGGVFLDPLRVSAPLSGEYSCPSYLLDLKSEELriCFLLSAFIDPNNSGQG 
NPRPMSINFGNSPLGORWSEFLSRVLHDETEKHVAWCNNPQL^KKSFPSHSLSLLENEL 
EE-SGYSYLNIVSV3QERTCVKERRILSSDPSGRSFTVILTDLPEGSSDIRNL0LASDRIL 
VoSALDAADACASECKILEYEDPEOEWAOQYASFYRNIDRAaDLQRQGIPGEPLGVSAST 
R\ZVLEKDIVFNLNAVTOOAMWKFKKRDLFAVESQALGDDMRflALEGYIGSSLLVEGTIQP 
0VACNVNV3FATLDEAVCAACDSAQDAPSEENNTDD 

CPn_0463 532490 531191

No robust homolog present in Genebank/EMBL as of 11/7/98
I ;: ; F'YEKTROt -LOTPNCRT PR\m [ STVG t P I DETCr IAFVD3MMK0('VGQDAKELYTFLSR 
r.NKffYni^LW^SLKFXUU-'LFDEKMLCAPL^EDIf^c/^YLVDLTOHLKDLILGMFLDPQ 
N I SAf JKLLKV:! I NV^IUSFr.rLOQKDFL^WLRDETfAfVVVVFKtlVLSLPATOVCKLVEE 
t.ri::KDY::VLN[F::CIK;DS::pOr J LFRKEI.EGTSGRYrrvrCALYL/;DTDMR^LOLASERIM 
V.'IHRFLI .VDAYAARCKI .LK t DHTNWR POT F 3 R H A^F AD A V D V:J AG F NS R E F K L I TO ANQG 
1 1 .F. : Jt ;E I .I'LPiJKTf-W.GFI ,AFCDRVTVTRI I F t PMLDAA I KQAVWTHKH F3L I DKECEALD 
-rV::YIJ:YVTN::HRKT:^KGr'F[0KEI/fADr::;r'l.,KEALFPG3DEDVP3TnEDPS 
UljHf-niJU-iU:: 



No robust homolog present in Genebank/EMBL as of 11/7/98

3LETRGRFTEICLOLLFFDt0^LKFWLF£FXrrALNLFR [FAPLRNRVTTEYSRARQPDL 
^WW^sISsWePLISY^^^ 

IROTFFSEDAVPESEPFDLGIYVHT^ EYPQSEFLLM 
RPRML3 



CPn_0465
ALLt'I I'JJLYVGLLWLLJR if'KL
RDEGKVHGDLPSAPFF 

CPn_0466 533718 536537

pmp_15 -Polymorphic Outer Membrane Protein
TSMRFFCFGML^PFTFVUVNEGLOLPLEra 

DF I LDYKYYRSNGGALTCKNLL I S EN IGNVFFEKNVC PNSGGAIY AAQNCT I S KNQNYAF 
TTm.VSDNPTATAGSLLGOALFAINCS ITNNUKX3TFVDNLAI1JKGGALYTETNLS IKDN 
KGPI I IKQNRALNSDSLOGGI YSGNSLNI EGNSGAIQITSNSSGSGGGIFSTQTLTISSN 
KKLIEI SENSAF ANNYGSNFNPGGGG LTTT FCT I LNNREGVLFNNNQSQSNGGAI HAKS I 
1 1 KENGPVY FLNNTATRGGALLNLS AGSGNGS F I LSADNGDI I FNNNT AS KK ALNP PYRN 
AIHSTPNMNLOIGARP^RVLFYDPIEHELPSSFPILFNFETGHTGTVLFSGEHVHQNFT 
DEMNFFSYLRNTSELRCXT^VEDGAGLACYKFFQRGCrrL^LGCG 
PTTVGST IT LNH I aTdLPS I LS FQAQAPK IWI Y PTKTGSTYT EDSNPT I T I SGTLTLRNS 
NNEDPYDSLDLSHZLEKVPLLYIVDVAAQKINSSQLDLSTLNSGEHYGYQG IWSTYWET 
TTITNPTSLLGA^KHKLLYANWSPLGYRPHPERKGEFITNALWQSAYTAIAGLHSLSSW 
DEEKGHAASLC^GLLVHQKDKNGFKGFRSHMT<rYSATTEATSSQSPNFSLGFAOFFSKA 
KEHESONST SSMHYFSGMC I ENTLFKEWI RLSVSLAYMFTS EHTHTMYOGLLEGNSQGS F 
HNHTLAGALSQWLPQPHGESLQIYPFITALAIRGNLAAFQESGDHAREFSLHRPLTDVS 
LPVGI RASWKNHHRVPLVWLTEI SYRSTLYRQDPELHSKLL I SQGTWTTQAT PVTYNALG 
IKVrorTMQy^PKVTLSLDYSADI SSSTLSHYLNVASRMRF 

CPn_0467 536528 539434

pmp_16 -Polymorphic Outer Membrane Protein

NEILTIfe^^lKIK^PLVSKTPPKFLFYI^FTACMFC^PAVYSI^DSLEKFALERI>E 
EFRTS^PLLDSLSTLTGFS PITTFVGNRHNSSQD IVLSNYKS I DNI LLLWTS AGGAVSCN 
NFLI^rTTOHAFFSKNLAIGTGGAIACQGACT ITKNRG PL I FFSNRGLNNASTGGETRGG 
AIAOfeDFT ISQNQCTFYFVWSVNNWGGALSTTCHCRIQSNRAPLLFF 
RS EliTT I SDNTRP I YFKM^CG^^^CGAICr^SVTVAIK^WSGSVIFNNOTALSGSI^^SGNGS 
GGMYTTNLS I DDNPGTILFNNNYC IRDGGAICTQFLT I KNSGHVY FTNNQGNWGG ALML 

DSTCLLFAEC^IAFQhJNEVTLTTFGRYlWIHCTPNSN^LGANKGYTTAFFDPIE^ 
H PTTNP L I FNPNANHQGT ILFSSAYIP EAS DY ENNF I S S SKNTS ELRNG VLS I EDRAGWQ 
V YKFTQKGGILKLGHAASIATTANSETPSTSVGSQVI INNLAINLPSILAKGKAPTLWIR 
iLQSSAPFTEDNNPTITLSGPLTLLNEENRDPYDS I DLS EPLQNI HLLSLSDVTARHINT 
/dnfhpesijjatehygycgiwspywvetitttnnasi ETANTLYRALYANWTPLGYKVNPE 
' YQGDIATTPLWQSFHTMFSLLRSYNRTGDSDI ERPFLE IQG I ADGLFVHQNS I PGAPGFR 
IQSTGYSLOASSETSLHQKISLGFAOFFTRTKEIGSSNKVSAHhnVSSLYVELPWFQEAF 
ATSTVLAYGYGDHHLHSI^PSHQEQAEGTCYSHTLAAAIGCSFPWOQKSYXHI^PFVQAI 
AiRSHQTAFEEIGDNPRKFVSQKPFYNLTLPLGIC^KWSKFHVPTEVfTLELSYQPVLYQ 
QN^IGVTLLASGGSWDILGHNYVRNALGYKVHNQTA^ 
GSTLKF 

CPn_0468 539608 540432

pmp_17 -Polymorphic Outer Membrane Protein
IYldi£>NKU4I FTDKLYFH IKVWMFMRPIC^ 

SAFHT5 PS FRLNVT PE PLVS S FRPSNLLNG FGHD ITQD I T I TGN5 1 NSV I D Y^fYHY EDGG 
ILACKNIJISENKGNLSFERWSSHSSGGALYSWECWISKNQWSFISNAASLATTT^ 
FGGAI HALDSY ITNNLGEGQFLDNVSKNRGGAIYVGVSLS ITDNLG P I VI KKNQTLEDS S 
FGGGIFCRAVNIERNYQNIQ INDNASGQGWYFLP 

CPn_0469 540399 541460 

pmp_17 -Polymorphic Outer Membrane Protein (Frame-shift with
0469)

CFRTRGG IFSALGVI ISSNKEI IEISNHSASS INTASGKLYPGGGGIMCTSLVIENNPKG 
L I FNNKTAALSGGAI KTRSFI FQNNGPTAF INNSATSGGALINLSG IGST PQNFFLSADY 
GD I LFNNNT ITSSS PQPGYRNALYAAPG I NLKLGARQGYKI LFYDP I DHDQTTTDP IVFN 
YEPHHLGTVLFSGINVDSNATNPLNFLSKFSNSSRLERGVLAIEDRAAISCKTLSQTGGI 
LRLGNAALI RTKGPGSS INFNA I AINLPS ILOSEASAPKFWI YPTLTGSTYSEDTS ST IT 
LSGPLTFLNDENENPYDSLDLSEPRKDIPPPLPPRCDCKKNRYFESHCRSHELR 

CPn_0470 541357 542532 

pmp_17 -Polymorphic Outer Membrane Protein (Frame-shift with 
0470) 

I S LNLER I S PLLYLLDVTAKK I DTSNL I VEAMNLDEHYGYQG IWSP YWMETTTTTSSTVP 
EQTNTNH RQ L YVDWT P VG Y R PN P ERHG EF I ANTLWQ S A YNAL LG I R I L P PQN LK EH DLEA 
SLCGLGLLINOHNREGRKGFRNHTTGYAATTSAKTAARHSFSLGFAOMFSKTRERQSPST 
TSSHNYFAGLRFDSLLFRDFISTGLSLGYSYGDHHMLCHYTEILKGSSKAFFNNHTLVAS 
LDCTFLPAR ITRTLELQPF I SA I ALRCSQAS FQETGDH I RKFH PKH PLTDLS S P IG FRS E 
WKTSHH I PMLWTTEI S YV PTLY RKNPEMFTTLLISNGTWTTOATPVSYNSVAAK I KNTSQ 
LFSRVTLSLDYSA0VSSSTVGQYLKAE3HCTF 

CPn_0471 542561 545401 

pmp_18 -Polymorphic Outer Membrane Protein

TVQNNRSLSKSSFFVCAL I LGKTT ILLNATPLGDYFDNOANQLTTLFPL I DTLTNMTPYS 
HRATLFGVRDDTNODIVLDHONSIESWFENFSQDGGALSCKSLAITNTKNQILFLNSFAI 
KR AG AMYVNGN F DLS ENHG S 1 1 FSGNLSF PNASNF ADTCTGGAVLCSKNVT I SKNQGTAY 
FINNKAKSSGGAIOAAIINIK DNTG PC L F FNN AAGGT AGG AL FAN AC R I ENN SQ P I Y F LN 
NOSGLGGA I RVHOEl* [ LTKNTGHVI FNNNFAMEADI SANHSSGGAI YC t SC3 1 KDNPGI A 
AFD^^^^^AARDGGAICTO^^LTIQDJGPV , ^ r FT^ING<^^WIXI;AIMLROD^ACTLFAD 
^fNRHFKDTF^JNMVS^^^ICTRN^^]LTVGA^OCII^JATFYDP I LQRYTIQNSrOKFNPNPEHLG 
TILFS3TY I PDTSTSRDDF I:'MFRNH IGLYNCTLALEDRAEWKVYKFDUFGGTLRLGSRA 
VF:;TTni-EO^:::::*Vv^;VtN[NNLAINLf^ [ L/INRVAPKLW I R PTG.^UAPYrJEDNNP I [NL 
:'jnF*f.::[.I.DDEN[.nrYD , rADLAoi , IAEVPLLYl.LDVTAKi[ tNTDNFYPF/'LNTTOMYGYOG 
VWSI'YW F RT [1T:*r^:j:";KDT\TsITLMROLY' IWr^rtjYKVNPENKGn r AL:)AFW0^Ft^ 
ATLRYOTLHji'O rAl*IV\:/'ir^\TR[ J-'VI (fjN.'JNNlJAKOFHMEATCYrJUrn'SN'rAIINHllFOVN 

[0'iT.GK{:Y:;ri , D^L:;r;:n.:H^wp:;RPi l iirrpFroArAVRSNu , rAFOEnr;DKARKFr]vi( 
Ki'i;^NL'rviMino:^WE:;Kn«jTYwriiKu\YUPVi-Y'>jNpEtNV:;L.&^ 

[J\l'NAU\FKuUNOtF[Ri*KL::VFI.LY^;:;V:;::::'rrrilYUIACTTFKF 
CPn_0472
No robust homolog present in Genebank/EMBL as of 11/7/98
FVFMA^JCrajS.'JGLGKIPPKDNCDRSR^PSPKriELGS^^BfPQEHCEECASGSSHIHS 
5S5FLPEWE:^SSSSAASSPGFFSRVRSGVDRALKSF^^AEST30ARETRQAFVRL 
^KTITADERRDVDSSSAAATEARVAEDASVSGENPSQGVPETSSGPEPORLFSLPSVKKQ 

^RLV^RDRIVLPSGA^ 

ATVTIQQLIQITEFCCCYMEAT05SVSLAEARFKGVETSDEINSLCSELTDPELQELMSD 
nDGW^LDETADDLEAALSHTRLSFSLDDNPTPIDNNPTLISOEEPIYEEIGGAADPQR 
TRE^STRLWNOtR^LVSLLGMILSrLGSILHRLRIARHAAAEAVGRCCTCRGEECTSS 

: , : , rj - ;M -"- 1 r ;;.!■-•■ [■ft- :::rnnvr>rT'r:- \:\ ■•• -n w» iaj .vtwahki f : aktkf..v f.. j..' 
..mV.-V ':v:-p:w:- ; m ):;:;v::f-TVMF.M;M r FY-'/zr-PKn/ ; r VI ,vr: :;;rPw::PAFRI.K EDV 
rcDYi-VftTJAEPSKDKNIYHTPRLATPAIVDLP^RPCSGGSSR^PSGDRVRSSSPNRKG 
VPLPPVPSPAMSEEGSIYEDMSGASGAGESDYEDMSRSPSPRCDLDEPIYANTPEDNPFT 

RAALLSESVS^VW^IAES 

CPn_0473 549602 548070 

No robust homolog present in Genebank/EMBL as of 11/7/98

KIMAVGGVGGSRSFSPIPFNRRNSEIHKW 

QLGGTDKIPLPSVKEPGDSQTSGRSGVLQRIWKGWGWKKTP 

GQRLPGLEGFRI)RIQKRSENPEAJ3I^KMKRSYSDGDIJ)RVGHDSNEDSTEDSRSEGGEPS 
SKSSSFLSGVRGAVSKVHGA1XJDIKGKFQRSASEDDLTTG<3EDSAGDTVKEPJISEEAEAS 
SKSSSFLSGVRGATSTVQGALGDAKEKVSAFGEQAAGAIRSAPGNIRTRFQRSSSEGDLS 
WVNKAAKHLRKALEM,EKVAPEQVSPEVASRVQSLI^EQLTHQEPPTVEDLITFVESN 
VGSDSVET/ASIVPQDGSQAPAETAEAPETGGVEGSAAC^AWKALRDFVVSIFQAVASFFR 
AIASRLSSARRESAVDDLASESNTQWFVEQEGVSNPSAAPSLSFAEEIARRAAEMSNRNA 
QSLEKLESGNVTDPVIQQGLGLARSFAPEGQ 

CPn_0474 551600 549807 

CT365 hypothetical protein 

LKIIISISFMSTSPISNDPRYLSLSNATEKTSLLANSRSLSPVPNSLVPSNPEDTGLRKS 

I FTHSVTLFAGLWLLVAVSVVWALTVLAPGVPQAI LLG IAI SGVGIGGFS IMKSLVYM 
VRDYMSPRWQESSRIKSAI^VGTGFTVMGLVMKVGANFVPGGYGGLV*GSLGSSAYSRGSQ 
TTLASFSHY IYTKFFRSEKVAKGEKLTEAETIKEAKKLHYITLSIATIGVGLAVLCILLA 
I AGTVLLGGAPAT I AI I LAPPL I S IGLTTVLQT ILHSS IGKWRAFLLTQ EKKDLFVDT SL 
KD I RLEKL PPSEVEES ETSQSV I EVPDSEG IAETR I SAEEI DTRLSLTTRQKVI FALATL 
LLLAS I AAF IVTGFGGLTVMQVLLVASVGSAVASVTLPMVSSGFSYVAYQLKARLNISKL 
RWKEAKNKKRVRQFLIESGVIASDREFWMWKTVYKKQIQKTDAAIREEVRNFEKGGEVN 
S ALVGG I LLGVGTG IMLLALVPAFAP IVPG I LALGGSTLG I AGS ILMRKFVNWLYDELVK 
LYEPJ^NRREL^YGPESKMRSIATDLVVEAIJ^H^^ 

CPn_0475 553850 551685 

glgB-Glucan Branching Enzyme

PSH^KLIHPVTOLI^LLVSGRQKDPHKLLGILASEDSSDHWIFRPGAHTVAIELLGELHH 
AVA^SGLFFLSVPKGIGHGDYRVYHQNGLLAHDPYAFPPLWGEIDSFLFHRGTHYRIYE 
RMOTI PMEVQG I SGVL FVLWAPHAQRVSWGDFNFWHGLVNPLRKI SDQG IWELFVPGLG 
EGJI^rtCWIVTQSGWIVKTDPYGKSFDPPPC^TARVADSESYSWSDHRWMEPJlSKQSEG 
PV?IYEVHI£SWQWQEGRPLSYSEMAHRIASYCKE^ 

GY^PTSRYGTLQEFQYFVDYLHKENIGI ILDWVPGHFPVDAFALASFDGEPLYEYTGHS , 

QAL^HWNTFTFDYSRilEVTNFLLGSALFWLDKMH I DGLRVDAVASMLYRDYGREDGEWT/ 

PNi^bGKENLESIEFLKHLNSVIHKEFSGVLTFAEESTAPPGVTKDVDC^ 

CWMHOTFHYFMKDPMYRKYHQKDLTFSLWYAFQESFILPLSHDEVVHGKGSLVNKLPGDTV 

WTRPAQMRVLLS YQ ICLPGKKLLFMGGEFGOYGEWS PDRPLDWELLNHHYHKTLRNCVS A V 

LNAtiYIHQPYLVMQESSQECFHWVDFHDIEI^IAYYRFAGS^SSAU,CVHHFSASTFP\ 

SY^CEGVKHCELLLNTDDESFGGSGKG^^lAPWCQDQGVAV^UDIEXPPIJ^WIYLVT/ 



CPn_0476 554877 553858

CT865 hypothetical protein 
GRGfiRADWG DCM I D I MQHF KPYTMVPGQKLP I PG SL LY AQ VF PTLWRLF S S KHE I LNEQT 
LQVQG P LKRF AVFQDLHRGGLAVTS ERYKYYLLPSG ECTQS I KGKLPSAAQAGPLDSLGV 
HKH^QKVRCRRDLKEILPLWFRFAAMAPKGSYRTLETTAIGSLVKTAHQRVLHRETTE 
IA?ALLS IALAGFSECFLPRSYDEEFQGI LPQDGDPEGGVPFELLSYSFGMIQDZFLRHQ 
gqE^ei LPALPPEFPCGRLIHVALPNIjGTLS IVWTKKTIRQVELHAEYSGEVFJSKFCSSL . 
CSARLREWSERRLSGSKRLSLGETLEIKAGTTYLWDCFHK 

CPn_0477 556112 554844

•y<S&LBs Hypothetical Protein
R YMW A EVKGT F KLVC LGC RVNQY EVQ AY RDQ LT I LG YQ EVLDS E I PADlfc I INTCAVTA 
SAis^GRHAVRQLTRQNPTAiilVVTGCI^ESDKEFFASLDRQCTLVSNireKSRLIEKIFS 
YIOTFPEFKIHSFEGKSRAFIKVQDGCNSFCSYCIIPYLRGRSVSRPAEKILAEIAGVVD 
QGYREWIAGINVGDYCDGERSLASLIEQVDRIPGIERIRISSIDPipiTEDLHRAITSS 
RHTCPSSHLVLQSGSNSILKRM^KYSRGDFLKrVEKFRASDPRYA/TTDVIVGFPGESD 
QDFEDTLR 1 1 EDVGF I KVHSFPFSARRRTKAYTFDNQI PNQVI YEBtKKYLAEVAKRVGQK 
EMMKRLG ETTEVLVEKVTGQVATGHS PYF EKVS FPWGTVAI NT//SVRLDRVEEEGL IG 
EIV 

CPn_0478 557640 556210 

hflX-GTP Binding Protein
WHGG PLDT I DT PC EQGSQS FGNSLGAR FD LPRKEQDPSOAlAvASYQNKTDSQWEEHLD 
ELISLADSCGISVLETRSWILKTPSASTYINVGKLEEIEEttLKEFPSIGTLI IDEEITPS 

QQRNLEKRLGLWLDRTEL ilei fssraltaeaniovqlaqaryllprlkrlwghlsrok 
sgggsggfvkgegekqieldrrmvrerihklsaolkav/kqraerrkvksrrgiptfali 
gytnsgkstllnlltaadtwedklfatldpktrkcvkpggrhvlltdtvgfrrklphtl 

VAAFKSTLEAAFHEDVLLHVVDASHPLALEHVCTTYDlFQELKIEKPRI itvlnkvdrlp 
OGSIPMKLRLLSPLPVLISAKTGEGIQNLLSLMTEMQEKSLHVTLNFPYTEYGKFTELC 
DAGWAS5RY0EDFLWEAYLPKELQKKFRPFIS\7FPEDCGDDEGRGPVLESSFGD 

CPn_0479 558434 557616

phnP-Metal Dependent Hydrolase
A rGMVRD [Q5ES IGKLVFLGTGNPEC I P V P FCfiC R VCQNTG I H R LR S S V L I Q YQNKT L V I 
IW^PDFRTOMLVAGVGELaJVFLTHPHYDHIfiSGrDDLRAlATt-IVTORSLPLVL^AoTYRFL 
NKAKEVLFATPWES^LPAVLEFTILNEDCr/OEEFOfJIPYTYVnYYOKSCHVTGFRFGNL 
AVLTDLC^VDAKIF^YLDWETLELGAGPyETriPFC/IHK^CHLTVEEAKAFANHAGIKN 
I , [ [ T! 1 I i : I IC LEA ER DQIl PEVTF AY DC JM E^LWT L 

CPn_0480

'."Tint hypothetical protein
^iJWU^JLUIKUJF.t-VriKKEQK 

PV 1 WDV'jf lAPPpr ; [ LOV[,r«:KQHWT<> ;L.PVH(T IT.l LW AL E PVGKt t A i 'O L£i j AMY ELC S 

(jviiNFu i' ::; rv:;wvFX r,i i facl/vgwiveapli agl:jawv i rc i iogvga i lclfa t l 

MAYI/;p.^[<vrU-WLri[ J :;HEYrTQCl4CRO[nAlir;QN\'rJV[TEYPATCA£.i;OPrTKLPNGSRR 



ESe^SELKL^^SeSL rV^KDRGL0ID0^L0NILKLErL3TTL3LLKKDCVH 

YLUANUKLDNLK IAF J JJVrAW&feLFLiKY'JJVVKVIjHKU LRiivlLSNTSILEN&JoFtA.;. 
LYEmSYLIOWANTUXCVR^ 

DVLS EQ AAVML VHG LAAOG VSFCG LKALMY L.T AV PQ RMWLG AL P LF ES F PVFNRMK EFLG 
ESLGD ' 

CPn_0482 561764 560961

artJ-Arginine Periplasmic Binding Protein 
NLAY^OTFMIKQIGRF/FRAFIFIMPLSLTSCESKIDRNRIWIVTGTNATYPPFEYVDAQG 
E\AA3FDIDLAKAIS£I^KQLEVREFAFDALILNLKKHRIDAILAGMSITPSRQKEIALL 
PYYGDE\^EL^SK^LETPVLPLTQYSSVAVC5GTFQEHYLLSQPGICVRSFDSTLEV 
IMEVRYGKS PVAVL^SVGR^/VLKDFPNLVATRLELPPECWVIiGCGLGVAKDRPEEIQTr 
QQAITDLKSEGVI9SLTKKWOLSEVAYE 

CPn_0483 561330 564964
No robust homolog present in Genebank/EMBL as of 11/7/98
I ILIKKRAIFSRMFPI PPPHCPPNNKNNFYHLTTDTKDPLLLRILRTIGYVLLH I ITLGL 
LLLI HYYKHhSvVRKEGLPTPPTLPKGPE PKT I E IAKQP PKDGEDKKPDVPK PGT P PPED 
TPPPPPKAPSPASPKVPKQPADKKPTPPPEAPPPPVRVATPMPI^PSSCOYWOCLNRMVS 
MVU^PLPXPAMQVDPILGDFNPHFVASYPNRIDNEP^FQIKQFKKIAQNPDLPW 

rijvolslSqalyi^nyylwpgdgncfwyavgwlsalyeessrndivfeqeatrll 
dlpfassspananlcaema£llqlcstycs f i dlydgvi lsqkhtat l i aflrkls aya i 

RCOIAAfiSNEETARALFISDMQDDLXtPSVLEFIJ\ANRPYSELFQ^IDHSALPYMQSRDK 
r f t .TT"i^ T.PflTTT.TnAFLnKMSPEIXX3IJ^KQYEREIR£AFAKLSRKIADSGWOTERFT4AI 
TraiRCQYSRFLATIENRRSGDLPWSPALSFFAFl^TCPSVRFHKLCATFYKSLE 
DII&APPQRSIQEIWISNASLSYLNEDLDSSWQREVISSNIMTILTTHESLTLESSM 
PQI£TLHKRIANLIJQWISTSFETPPLSNQPDLLSNLVNKLLVAIHSKLEI^ 
SLRLTPJ3EGSGI^QEQDLLYTQAVQLLFFILQHPQVNNRPETKDAVKEIJ(ML^ 

YXFKKVENEKKU2KIXRSIL£SLVLKPPAR^ 

QF1JRATFPWQLETEAILLEKEIESTFRNGWNVFLTRLNLFGSKLGSPSSPTALS 
,^FSKSFLIFCFU^PKLU3KKTPLAARIJ3AFQREASHRFTQVKDKJXLSLKYGFPLAT 
' ATINQYSRAi^LICNLLKNTVTASDGFCRSG FRQS L IGYLHSLSSNELGDI LDDVKEQA 
EANrjVAAWTTVPLQPFAVCLIMSDRDTVS EEN I ENFVAMHGFLNT I S PERDAR I FL I RF P 
NHYGCLLPRNPRTEDQNSKPDS SNP 

CPn_0484 564931 565824 

aroG-Deoxyheptonate Aldolase

RSELKTGQLKSLVLHEVLI LTFTYPLPRTLKQH PDEVHTVP I S PNLSPG EGS PILI AGPC 
TLESYEHTVSSALTVKEAGAQVFRGS I RKPRTS PFS FQGWEKECVLWHKEAOS IHGLPTE 
' TEVLDVRDVEITAEHVDILRIGAKNMHNTPLLQEVSKSHRPI I LKRS P AATL EEWLCAAE 
YILASSPSCPGVILCERGIRTFEHSTRYTLDLJnVALLKEISSLPVIVDPSHAAGKRSLV 
LPLASAGLSVGATCL^IEVHA^PEKAl^DAKQQITPEELHLFAKKHFCPSESRAHAIS 

CPn_0485 565993 566229 

CT382.1 hypothetical protein 

QP IGRTPTRVFLWRFM I KQ ACKFYLLC^TLLCALYWLLKYCRKLLKGTLHHSEETLYQALL 
SSLI DLLYQLKQLPAPTNE 

CPn_0486 567799 566405 

hypothetical proline permease 
AQHRSLI^NIFHLGCGVLYFMNFSLFLFFLIAIOGICLYVGRRGSKKVEDRESYFLAGR 
SLK I FPLMMTF I ATQ IGGGVLLGAAEEAFCYGYGG I LY PLGVALGL I FLGMGPGKRLAEG 
SLTTWSIFEVFYGSKKLRKIAFLLSAGSLFFILVAQVIALDRLFSSFPFGKYVTVAFWI 
VLASYTSTCGFRGWRTDVIOAGFLLIAVLVCGVSVWLSVPKSLSVLDPFOSLPCAKLSN 
WIFMPMLFMLVEQDMVQRCVAASSPKRLQWAAVGAGLVLLLFNFTPLFLGSLGAKAGLKA 
GCPL I DTIAYFCNPSLAAVMAAA IGVA I LSTADS LMNAVSQL I AE EY PT LKAPYYR YLVL 
GLAVAAPLVAIGFTNIVDVLILSYSLSVCCLSVPVGFYLLAPKGRRVSGAAAWAGVLVGA 
LGYGWVQ IVSLGMFGELLAWVGS LVAFSFVGF I EITWKNKVKTQT 

CPn_0487 569833 568112 

CT384 hypothetical protein 

RRTGGISLTYSSFRWASFRCYSLIFFCFCGSLFGSESLRYQLLIQDFAKVSEEGIGLLES 
KEYSLLQAKLVLRALAQNSSFDDWFRS FKKCQ I SYPELAHDRDVLE EFG IQVLREG I ENP 
SVTVRAVSVT^IGLJ^DFRLVPLLLOSCNDDSArVRSLAI.QVAVNYGSESLKKAIVELAR 
NDDSIHVRITAYQWALLOIEELLPFLRERAENKLVDSVERREAWKACLELSSQFLETGV 
AKDDIDQALFTCEVLRNGMLPETTEIFTELLSVEHPEVQESLLLSALAWSHQLiQNHKEFL 
SKVRHVMCTSPFAKVRF0AA.\LLHLHGDPLGRD3LVEGLRSP0PLVCEAASAALCSLGIH 
GVPLAKEHLESLSSRKAAANL3 1 LLLVSREDI E RAG DV I AR Y LSNPEMCWAIEYF LWDAQ 
WNLRGDTFPLYSDM I KREIGRKL IRLLAVARY3QAKAVTATFLSGQOAQGWS FFSGMFWE 
EGDVKTSEDLVTDACFAAKLEGALASLCQKKDOASLORVSQLYNDSRWODKLAILESVAF 
SENLDAVPFLLDCCHHEAP^LRSAAAGALFS IFK 

CPn_0488 570147 569767 

hitA-HIT Family Hydrolase

RKLPTCFAVNVTRSRDHMTVFKQI IDGLIDCEIO/FENENFIAIKDRFPOAPVHLLI IPKK 
P I PRFQDI PGDEMILMAEAGK IVOELAAEFG t ADGYRWINNGAEGGOAVFHLHTHLLGG 
RPLGAIA 

CPn_0489

(TTIBV hypothetical protein

R I VFA r ENYF::LV*rMEWr.RR T/^MO T PR:UfJTI IDG.CFHADEVTACAr.L I IFDLVDENK I 
[RSRDPWI.r;K{.'E\VCDV<\*\.T.^ [ EMKRFDHHOVr.YDf l.^WMSAC IM I LHYLKEFCtYMDCEE 
YHI-'I.NNTLVlKIVDKOPNf IRFK.-'KEGFC.'iF^DI [ K rYNPREEEETN.'JDADFSCALJlFTIDF 
LCi< LI'.KK Fg YOR V( 'Ki ; t VRF,-\METEDM( .T.VFDP PLAWOENFFFU Y) RKI I PAAFVCF Pflf :D 

nw i LRnr pfNLDRRMiwRvrFPFJiwAf ;m.*;kei/:kv.'>'; r pgavfhik' ;i,Fi.::wrNKKr:c 

ORvNLKLT[ ( OiM<';[ [ 



',7 * \ ; 



...JM'JU ')"LJ7» 
..'l'lHV hypothetical protein

iMYN[XHAHi!f;AA::ru;iMA^nit.KK[.:;riiivi:^EVLn':rKPAYl^/;i- , HLl i *.v i 'iUVNL.K:; 

:;iA0l/;vr^VtJJMI.ia.NKA1iKFAl'LH7IJ-M:oi^ j rATAMLEL[.Kl^ , K^FVi:Kt.l''AAlH 

v[<::i<:Y[ ( NRMrnmm'[Y;::[ i i.i.i'F^;KKi.KMF['['i ( i : :[ indrlwflpi t-Mrrn vKi-rnvf; 
POt LRHTAAD [ LEPTTQESCD t YEFYG3T3EP I ER I Pt^«PYKEH3FFFYRDMLQE
^E^EVFRVFES 1 PEGEDQAAMF I SKGSELLELSQD^KrI 3PSDERHARE IQKH 
IeDOPCFPFLKAMETDH ITSQOVLF3RYFPSA3LKGMFLSNYSRYYLQHIYFQIPSPTSG 
EFFSNRDRS FLLDLYFAG I3VFWADLESKRLLQY I KRRNKDVGMFVPKHQAEQFAQSYF I 
GIHGSCUAGDYDEFLRELLTCKHTLSQQFTIP 

EL5 I LSCGN L 1 13 LDTTNAYVEAKMSY A I PDLLERQADFHVDLAVFV ^ 
T 3 LKTCK KALVPVFL IGPVDYWK3K ITALYNSNHAVGT I RG3EWVHNCLFCLSSAKAGI A 

: y-.-.- v:.mht!.i i.;r!'t!i'vrF:r*:r/rv 

CPn_0492 574643 574804

No robust homolog present in Genebank/EMBL as of 11/7/98



CPn_0493 575142 574855



No robust homolog present in Genebank/EMBL as of 11/7/98

skt^shIktskgf^rf?qwirtftgrgs™ 

RKQEDAETS F I ETPKG I LKK PGNKDPKGKHVHWKDS 
CPn_0494 575370 575146

No robust homolog present in Genebank/EMBL as of 11/7/98 

VIMIRWPYGSYRGRNPSPraKKI^ 

SLEKKVKGISEAHFK 

CPn_0495 575507 576793 

YE^ISPEEIFISEGAKPDIFRLFSFFGSEKTLGLQDPWPAYRDIAHITGIR^ 
RKETG F I P EL PNQQ SLD I LC LC Y PNN PTGTVLTFQQLQALVNY ANQ HGTVL I FDAAY S AF 

LFA^F^ASLI24QEAGYYGLDLFPTPPAISLYXTNAQKIJCKSLETAGFSVH 
SLKETMVLA 



CPn_0496 576751 577812



PLlM)C SKSC I ETLKDFENLPE I Wl^NAEDS I VKARKI ARSLHTDKNWAIVTLGT 
VM^ETQKPVIYAAVPDRESLTLPKJmTOIYGVNOT^ 

kpIjepf PSDLQKEI VKKLH ASG I EVI EI S ITSSTFKTRI RQAIDKRPSAI ^5^5^imcT^ 
EG^&'LQEILKEKI P I ITDDTSLISEGAC IACSVDYKKSGKQIAKI VHHLLYNNHDVDSL 
RKtlAQ RLS PTTT FNED 1 1 KYLG I KLHKTERNQFLS FKS KKLEKS EKGKNVAVS 

in 

CPn_0497 578107 577820

CT398 hypothetical protein 

IFQRjWLDDSWILEVKVTPKAKENKIVGFDGQAIJCV^VTEPPEKGKA 
LPKRDVTLIAGETSRKKKFLLPNRVQDI I FSLH IDV 

CPn_0498 579062 578085

No robust homolog present in Genebank/EMBL as of 11/7/98
YCRLRRAPFM^RKARWWALFAhTTALISVGCCPWSQAKSRCSIDKYIPVVNRLLEVCGL 
PEAENVEDL I ES S S AWVLT P EERFSG ELVS ICQVKDEHAFYNDLSLLHMTQAVPSYS ATY 
DCAWFGGPLPALRQRLDFLVREWQRGVRFKKIVFLCGERGRYQSIEEQEHFFDSRYNPF 
PTEE^ESGNRVTPSSEEEIAKFVWMQMLLPRAWPJ3STSGVRVTFLLAKPEENRWANRK 
OTLLtFRSYQEAFPGRVLFVSSQPFIGLDACRVGQFFKGESYDIJ^PGFAQGVlJCYHWAP 
RrayTLAEWLKETNGCLNISEGCFG 



CPn_0499 580404 579205

No robust homolog present in Genebank/EMBL as of 11/7/98 
L5VYLLIFYFCNCSTMSSVNQSSGTPNPEEVTSPESTEENKNWSSDEAQATHAVALPIV ■ 
TQLSLPEGVGTSSEETASNPRVDEIVAEVSSSRAVADQISSLVERVGELLDDLKGAQSLF/ 
TSFQSELKNCLPAWKSSTRRLETRGAGDNADIARLELFRSDYEAVLGHANOFHGKAHLIL 
SKLTDVHHKLQGLSREDLSLAFDNNDRVLEHLGSLGLDVDAEGNWSLSCERGIPRLVLTJI 
DSMLVQIKKVNLPTVEELRTLQGTTESSSDPRVEESLSCCERLLNELRRLWANFVGFISfi 
CYDNIVFVLMWIVRRINLLPGLGCLPFHNPDASQEDQRSSSGERSTRRERLSP.RSDLSZE 
EMIVRAEGESIHPESPHGDGRNQPSRGDKQDSDSEEETEL / 

CPn_0500 580647 582362 

proS-Prolyl tRNA Synthetase

QPHSMKTSQLFYKTSKNANKSAAVLSNELLEKAGYLFKVSKGVYTYTPLLWRWSa 
IREELNAIGGQELLLPLLHNAELWQHTGRWEAFTSEGLLYTLKDREGKSHCLAPTjHEEVI 
C S FVAQWLS SKRQLPLHLYQ I ATKFRDEI RPRFGLI RSRELLMEDSYTFSDS PEj&MNEQY 
EKLRSAYSK I FDRLGLAYVIVT ADGGKIGKGK3EEFQVLCSLGEDT ICVSGSYoAN I EAA 
VSIPPOHAYDREFLPVEEVATPGITTIEALANFFSIPLHKILKTLVVKLSYSN£EKFIAI 
GMPGDRQVN LVKVAS KLN ADD I ALAS DEE I ERVLGT EKGF IGPLNC P I DFF AOETTS PMT 
NFVrAGNAKDKHYVNVNWDRDLLPPQYGDFLLAEEGDTCPEWPGHPYRIYQG/IEVAHIFN 
U'.TP.YTDf'FEWJFQDEHGOTQOCWMCTYG IGVGRTLAACVEQLADDRG IVWPKALAPFS I 
TIAFMGGDTVSQELAETIYHELO^CGYErLLDDRDERLGFKLKDSDLIGI^KLILGKSY 
0:;:;G r FEI EflRSGEKYTVSPEAFPTWCQNHLA 

CPn_0501

hrcA-HTH Transcriptional Repressor
[ LLTFrjr ;: IF P t MLSVT I VLVOLF. MARilKVSKRD^K I l.D t LFATTFXYUKTGOPVGSKTLK 
:;[jL: TAT niNYFAfclLFAL^LKKNHT.W.R IPTOLALRHYVDlf/KECf-EAEtLJAP I 
VT.K[:JOI^':'E:;i<NUKIM,OKATELUJEILDLPTFFf>:;PPPF.NDS\rr^[0nV/DK0RAVT 

i wvm--< :o f ftdtlwlfkacdtl:: t kr i ekflqny r rklptneel:^keehl."m:;lynev 

VVI'.YI.TKY' 'MF.'JKEDLY'JT* JMSKl.l.KYEAFKDPEVLALCLljLFENyROML'ELLf J LGMHKG 
HATAFl':Kl-:L::urE*rr::H^X;jVlT[PYYmR:;PU;ALC[LGr'[r«.l'YKFALPLLKLFAN 
K[HKTLM , 0:!FYKFKI.::ri>IU 1 [.T:'.NCKL:>NF.PrLRTEYrjr>lKLLPZKF.T!, 



KG I IEYSS IGOKFNPFLHEAVGTEETSp/PEGT I LEEFAKGYKiuERP I RVAKVKVAKAP 
TPKENKE 



CPn_050) 



534225 / 5Sri2l3 



, , .v-;7 ' ".^'irxrri^Tr.tlVAFV'^^FKL'/G: 

da L'oi %a\/TN PFKTLG LiTK f*' r W Kl«3 EV AiJ E i CTV p YTVTSGiiKGDAV F EVDGKQYT P EE 

ltr^feklaasliertkspc]^ 

EPNKGVNPDEVV^ 

TOt^OTFSTAADWPAVTrvyL^ 

i^FHVSAJO)VASGKEQKlfcIEASSGWEDEIQRtt 

^ F^AEKAIKDYKEQ I P £lC£eI EER I E^VRNALKDDAP I EK I KEVTEDLSKHMQK I 
DDK 

CPn_0504 586418 588514

hklieefmlkai^ 

livhruSp^dotX^^ 
^y^a^eIsf^efchegfiaaaelp^ 

SVNLLTQKI WS I ATTTEDKPKKI KKT PSKKKGTKKRAS 



CPn_0505 588471 589106



gpddka^aynyrktqrnw^lkggsayl^ 

DQGKEI&QRRQWRDKPPHIiTNG^ 
ATARI 31 DYAQEYRDVPWRFLLS PEDSG KVLS 

CPn_0506 589055 589840

^ LTQPFVGLASEFFEKFSFYTKHRALLKFVLQI ILXFGLFFATVLLGFLTRIMI 
&LLS I YbKI LHR-I P 1 1 KTVYKAAQQVMTT Z FGSKSGSFKQWMVPFPNANVQC *GLVA 
^AP1VCC7GEKEDDPLVTVFI PTTP^ SCGV 
rPMACBfiSPLPDELHQDQGS 

CPn_0507 589898 590122

tYPOFPL^II^FNIELFMT^ 
NALSTSDSIFIPKIG 

CPn_0508 590133 590300

CT421.2 hypothetical protein

SR IMSRHRS YGKSVKGVTKRNVLKRF ERVEVLRKLGRWNDSTAKKVTGLPKTP I LK 

CPn_0509 590299 590808 

(predicted Metalloenzyme)

NKFVFLYGNF I RVTQEK I K I HVSNEQTC I P I HLVSVEKLVLTLLEi4IJ<VTrNE I F I YFLE 

DKALAEI^DKWADPSLTOTITLPIDAFGDPAYPHVLGEAFISPQAAL^ 

I Y EE ISRYLVHS ILHMLGYDDTS SEEKRKMRVKENQ ILCMLRKKHALLTA 

CPn_0510 590804 591973 

tlyC-CBS Domains (Hemolysin homolog)

qJWhilijufcillflafgltqpschgsskflct 

TLLCILYGALGTKLYTLLPPKTAHKDLLFWPLYSLSALIAYGFLPPWISTKVPKETTAHL 
RFLASVFQLGLFPl^OLLFYRRRPNQQVRSSTSFQSQLSEALSAFDNLIWEyMIPKVDIF 
ALPEETTLQEALVLVSEEGYSRVPVYKKNLDNITGILLVKDLLLLYTSSHDLSQPISSVA 
KPPFYAPEIKKASSLLQEFROKHRHLAr IVNEYGFTEGI ATMEDI IEEIIGEIADEHDVQ 
EOTPYKKIGSSWIVTXjRMNISDAEEYRILKIDHENSYDTLGGHVFHKVGAVPQKGMRIHH 
ENFDIEIITCTERNVGKLKITPRKRKFNIS 

CPn_0511 592141 592488

msdiokeehgsttifhlhgkIdg^ 

VLLO S Y HQ VGQH SG K I VLTTV P KT I EQT L YVTG F LS Y FK I FNTVD EAIQTLNKDGD 



CPn_0512 592538 594412



SLPLTMRRSVCYVNPSIARAC^ISTWKFLYSLATPLPAGTKCKFDLAGSGK^DWEAP^ 
DLSCTWA/IYAEMPEGEI IEATAIPVKDHPVPQFEFTLPYELQVGETLTIVMGASPNHPQ 
VDDACNGAOLFAORRKPFY'LY I DPTGEGNYDEPDVFSMDIPGNVLKKIEIFTPSYWKNK 
RFDITVRFEDEFCNLTNFi'PEETRlEL^Y EHLRENLNWQLFI PETGFVILPNLYFNEPGI 
YR IOLKNLSTQE I F I SAP I KC FADS A P f I LMWG LLHG ES E R VDS EEN I ETCMR YFRDDRAL 
NFYA^nCFENOENLl-irDrWKLINCTV.^LFNEEDPFrTLCGFOYSGEPHLEGVRHILHTKE 

tksii:;khkeykh i tlaklyk^vnudm r:; t fsftaskehgfdfenfypeferweiynaw 

r.S^rTTA/vLUHPFr [(.V^KO^KDI'Rt M'V f RGLKKItLRFGFVAGGLDDRGI YKDYFDSPQvO 
Y-;iT;r;rAriaJKYTRF.::iV^\I.KAMM(:YArn;PPEVL^FraTGAPMOaELSTGSKPGLNV 
NRI 1 1 ; IGHVAGTALt.KTVF. I IRMG fiV! ,1 lTFFrD. f ;mJLDYEYDDMVPL:JnVTLKDPMGKAPF 

VFYYi.HVTOAi)NAMAW::::r twvrji ,N 

CPn_0513

Ki • >/A >' '.'ilui't . 

VI' !< -I-AGl'.'Jl-Kf iVKNO L :V*\K K KI Wl'YJj. JTVLyMLER [VHO.'-VHOMTTCI^l'OPPKTSPLYC 
f|>F: ICKF^.T; FY AKP(;IH'KGWI,Y:;U )UI.I/A'l(jtirKTP[TE*/HtVa;f:FPr;CNL«UYYr:DLF 
TK t K EY DP'.) H 1 1 KALT A [ EY AY L3 DLDNUS t ^^^^IQ^^^^u^uvv^rnnPTMFKN

rn ^?aSe^/LGKRURK.X0OHA I PLKSLMAVAR lli^NMKALWNYOGT EAALDL 

CPn_0514 50S690 598520

ASGWDLTKUPFWAL^ 

LLOEYYALCOYRLGEEHYESFEKFREYYGTLYQQARL 
CPn_0515 596450 597181

LEKQ 

ELVEIiKLVASGE^QFEESSDEEGAF 1 1 DNEPSKTAMLKQRFKSCVRTKLVGSFADES 
LPRGRFTILV 

IPsImaSsI^ 
rVRHWD^RSSL^ 

KJK^KNIDI I PSVMVREDYPSRPG EGYREGLLRMYGGKGAL 
CPn_0518 600806

AELEKWQQFYVERSRIRIIEWLRNNXFH^ 

ABjQLLSNKAK I YYSNEALN PR PKRGR P PKQS AKVET ETT * S SD I YT ^J^^ 
if SPSS iTFSEKFDTEEEFLAfnJlGSTRVEDQLNLTNLSERFASLKELSAK] 
DFKGDDDEKWTKTKGSKRGRKKSS 

CPn_0519 601707 600904

dapF-Diaminopimelate Epimerase

hStFSPDGVNVNFVQ I L/3HCQLRVRTYERGVEGETAACGTGALAS ALWSNS YGWKES IQ 
ISIWGGELMTVSQNRGRVYLQGSVTRDL ' 

CPn_0520 602233 601646

H|p%IGGPITGQATDLDIHAREILKTKARIIDVYVEATNQPRDIIEy^IDRDM\»WrANEA 
KDJGLLDGILFSFNDL 

CPn_0521 603803 602241

glyA-Serine Hydroxymethyltransferase
KSLLKVFEKF^FAIVEIF^KWAWSLLHKFLENASGK^ 
PSIGERIIDEUCSORSHLKMIASEKYSSLSVQLAMGNIXTD^CEGSPFKRFYSCC 

aidJe^akelfmdcacvqphsgadanllavmailth^ 

EEYTLLKAEMSSCVCLGPSl^SC)GHLTHGNVRLNW 

RLAKEYKPKVLIAGYSSYSRRLNFAVLKQIAEDCGSVL^^FAGLVAGG^EENP 

t P Y AD I VTTTTH KT LRG P RGG LVLAT RE Y EST LNKAC PLnMGG P L P HV I ^^^Y^^^^ 
SVDFKKYAHQVVNNARRLAERFLSHGLRLLTGGTDNH^ID^SLGISGKIAEpiLSSy 
GIA'/NRNSLPSDAIGRWDTSGIRLCTPALTTLGMGIDmEEVADIIVKVLRNIRLSCHVE 
G3 SKKNKGEL PEA I AQEARDRVRNLLLRF PLYPE I DiyEALV 

CPn_0522 603825 604655 

CT433 hypothetical protein
REPLSPEKTSL^FKVKNVNQRMIKKNQGKKK^FlfiYIPLKVQKLRQPSFYPKRUfrLYLG 

LfiQKTARKYOAHYLPILTLFPYAKSTPQNKRAL^LPOATHVILTSPSSTHLFLSR 

LSFATLKTKTYLCIGESTKERLLSFL^QVKYW^TQEIAEGIFPLLOALPSSARILYPHS 

'JLARPVIREFLYNRFTFFSYPHYTVKPRKLKK^ILSKYKKIIFTSPSTVRAFAKIFPRFP 

ek™cogrmtlqefokfssokqvslletlgksrtsp 

CPn_0523 604720 605052

No robust homolog present in Genebank/EMBL as of 11/7/98
KMAG 5ATPG FDGTA F ULF P PAT RPRYNFWLaLFVT I A I ALVW I AL I ATT I AIGLC I H PLC 
,:F [ FLTA I PLYF ISRYICJHYARNVY IALDWrDHf-KLODMRSHSP I FSDR 

CPn_0524

No robust homolog present in Genebank/EMBL as of 11/7/98
K7 : -VKKrr/M';nr'SI<TF'^'0V: 'VL;7i* FRDKEI APKKOFTI AK [:;TLAILA.'JLALL>ALVALi 

!;;LTtvr/:Niwi.AUjrrALF:^ 

7!/ ' Y .' '» N KM' i F Y N N1 1 LN I ' K F K V A [ <JV DA: ' Q P F'l FT F LTG L R V [ KKNQ:'TG 1 1 ™^PTNL 
n/lTArMI.::Ttl^:Tt.KI>K::Vwfc 

ril.'i'JAFKK:: ! [.I'TI-'l * IIIVCi ;FK/i r .F.LFNUOEYYHwALt AYFN^'LKA/\rE;jHA/\IVALPLF 

t; ; 7Y kv i * i ' f. f. 1 1 . i *k fa ;t f ywi ij/jtoafck p al i .da t f jt fpai .i< v r*0RS llv [ lqdp fnt [ e 

:;o::k:;ki: 



CPn 

CT39H nypothet l c^|SP r l [; RV ^ KEH ^ 
■CrQIRDGENRIQEiSE^ 

VNAQENSTAKRRRR 



, tvu -Lv/vl^i k?"" ''KENKMF FM I JTDVCODi IjJKOKcAVDF Fr QAFQPKEAM 

^d^sSlWsaygcgc/civdpqfr 

CPn_0527 609910 608726

AOR^^GLONLQKIACTGKGGRVTRQDL£AY I SESQQVS I PE I FQGEVNRI PMS 



TLTFDHRVLDG IYGSEFLTSLKNRLESVTMG 



CPn_0528 611165 609921



^R^^M^YCVGA^ 



CPn_0529 612298 611165



KLHKDVXIAVVNGQDPLGGRAFFPKGRIJtoFPL^ IVNGGGKEAGTWKRV^A 
SJf^IASWWT^RIPK^ 

DTLSLLNMIEQIHKNRGN 

CPn_0530 613323 612460

^.^WR^CGS 

EKDGLTEDWTSEDFSEIALPMLGESDSLNl^TSVAAVAYE^ 
CPn_0531 614198 613245 

JaD^LFPLLSLCSKLLADDASYF^ 
CGEGVGALPSGSFVQWIA 

CPn_0532 614716 614075

ES FCC KDSWKWGGMFSG I IQELG EVC F F EAQGNGL S LG I KS T PLFVT P ^^^^y^P? 

GKKRBGERVNI EI DMSTKIQVDTVKR I LASSGKD 
CPn_0533 614918 615385

Sf^iS^ 

kadm i ay ir f acvyrrfkdvgelhevllsatpdmek 



CPn_0534 615389 615784



L^FTRS^WPLSDDEIEQFKKRLLEWKAKLSH 

TFDRTISLEVTTKEYELLRQINR7ALEKINESSYGICDVSGEEIPLARLIAIPYATMTVKA 
QEQFEKGLLSGN 

CPn_0535 615763 616296

kptpiwkLIjsmatrfr 
?TapvfnSS 

IfJNVGDT TFYGH [VPFtSFNYKQWAFPTFNVAD^/LISLGTLLLVYKFYFPTKQTEKKF 

CPn_0536 616300 617601

FC" K L F P Y Y ' '.ON PQ E R ET K<X! VH p LKV F F AS A( X"JH I G I Gt r//G T VT AA( . U fO P' j A L FWVW I 
A'\ ^ FC. ^ f VK Y^ KVYl t K FRK LDRDGWOGG PM*^ FL I KAFKT PWrj V I VA t U / M YOV F I 
Y0F3V t TDSLAI H 'WNI .PKVYFMI/JLLFLVFYA I P.GGLOP IGK ICC I VLPf* FMLLYl AL.'jL 
Yi| vKFFHTLrHL.L:TVF^AFKr^A[^FAGOTVATTIHOGtSRAAY:A.Ur^,[uHl:. [ 

l'MVKFF[.rrFI-MVr^YMT[[:;YFt.VnK^:AKFLYGNTGAKIYTLYf J LI J [Ll!.t-^.ll...ONr 
ALL.tM.':V:>;AU.U-|-NLUA*l- ILI'KEVrFPARAASLTETCLSTE 
(THI4 [ I hypothetical protein

L T FLLFMDNyLu ML I FCCVLL; ; FGMCT t FVMT IC Fd^MjCNTHRVTTILNFEAKX 
LA P LMU IKK L U XTWLKKRKNRfJSLo ED I DELLDEKKQR^^^wLDQG I KWCAALVL IWKV 

FRNKD 

CPn _nS39 ',18123 419511 

CT814 hypothetical protein

TKFTJsr.AOHW^^FriKf-r/rKTKTLRDtWRNNHKPKKTKCKRFRWLRGVLFGGFIATLL 



CPn_0539 618678 621545 

pmp 19 -polymorphic membrane protein 

gwllglrwwqi^wgflflssfcqv^ 

TTYSLRKDF I VCDFAGNS I HKPGAAf LNLKGDLFF I ^ P^TF^MARGAGLFS 
EShlVTFKGLHSLVLENNESWGGVLTTOT 

alffrdnrgtilflknkavnqdeshpgyggavssispgspitfa^^fqe^e^ 

AIYND^AITFENNFOTTSFFSNKASFGGAWSRYCNLYSQWGtn-LFTra^^ 
ADYVH I RDCKGS I VF EENS ATAGGAI AVNAVCDINAQG PVRF INNSALGLNGGAIYMQAT 
GS I LRLHANQGD I EFCGNKVRSQFHS H INSTSNFTNNAIT IQGAPREFSLS ANEGHRICF 
YDPIISATErm^SLYINHQRLLEAGGAVIFSGARLSPEHKKENK^TSIINQFVRiC 
LS I EGGAI LAVRSFYQEGGLLALGPGSKLTTCCKNSEKDKIVITNLGFT4L 
IRAtEKAS I EI3GVPRVYGHTESFYENHEYASKPYTTS I ILSAKKLVTAPSRPEKDIQNL 
1 1 AES EYMG YGYQGSWEFSWSPNDTKEKKT 1 I ASOT PTG EFS LDPKRRG S F I PTTLWST F 
SGLNIASNIVNNNYLWSEVIPWHLCWCXJPVXQIM^^ 

IPFSFNTILSAALTQLFSSSSQQNVADKSHAQILIGTVSLNKSWQALSLRSSFSYTEDSQ 
VMKHVF P YKGTS RGSWRNYGMSGSVGMSY AYPKG IRYLIOITPFVDLQYTKLVQNPFVETG 
YDPRYFSSSEMTNLSLPIGIALEMRFIGSRSSLFLQVSTSYIKDLRRVNPQSSASLVLNH 
YTWDIQGVPLGKEALNITLNSTIKYKr/rAYMGISSTQREGSNLSANAHAGLSLSF 

CPn_0540 621531 626862 

pmp 20 -polymorphic membrane protein

FI HLIYLSLIEFVNI SDRF SSMKWLPATAVFAAVLPALTAFGDPASVEISTSHTGSGDPT 
SDAALTGFTQSSTCTDGTTYTIVGDITFSTFTNIPVPVVTPDANDSSSNSSKGGSSSSGA 
TSLIRSSNLHSDFDFTKDSVLDLYHLFFPSASNTLNPALLSSSSSGGSSSSSSSSSSGSA 
S AWAADPKGGAAFYSNEANGTLTFTTDSGNPGSLTI£NIJCMTGDGAAI YSKG PLVFTGL 
KNLTFTGNESQKSGGAAYTEGALTTQAIVEAVTFTCOT^ 

KFEKNTSGQAGGG I YTESTLTI SNITKS I EFI SNKASVPAPAPEPTSPAPSSLINSTT ID 
TSTLOTRAASATPAVAPVAAVTPTPI STQETAGNGGAI YAKQGI S I STFKDLTFKSNSAS 
VDATLTVDS ST IGESGGAI FAADS IQ IQQCTGTTLFSGNTANKSGGGI YAVGQVTLEDI A 
NTIji^TNNTC KG EGG A I YTKKALT I NNG A I LTT F SGNT STDNGGA I F AVGG I T LS DL VEVR 
FS&TGNYSAP ITKAASNTAPWSS STTAAS PAVP AAAAAPVTNAAKGGALYSTEGLTV 
SGf&ES I LS F ENNECQNQGGG AYVTKT FQC S DS H RLQ FTSNKAADEGGGLYCGDDVTLTNL 
TGKikiFQENS S EKHGGGLSLASGKSLTMTS LES FCLKANTAKENGGGANVPENI VLTFTY 
T PT^PNE P APVQQ PVYG EAL VTGhTT AT KSGGG I YT KNAAF SNL S SVT FDQNT S S ENGGALL 
TQKAADKTDC SFTY ITNVN I TNNTATGNGGG I AGGKAHFDR I DNLTVQSNQAKKGGGVYL 
EDfelLEKVITGSVSQNTATESGGGIYAKDIQLQALPGSmTDh^ 
G LYSSG AVTLTN I SGTFG ITGNSVINTATSQDADIQGGGI YATTSLS INQCNTPILFSNN 
SAATKKTSTTKQIAGGAI F S AAVTI ENNSQP 1 1 FLNNS AKS EATT AAT AGNKDSCGGAI A 
ANJ^LTNNPEITFKG^AETCGAIGCIDLTIJG 

I YGET I DI S RTGATF IGNS SKHDGSAICCSTALTLAPNSQLI FENNKVTETTATTKASIN 
NL^AIYGNNETSDVT I SLSAENGS I FFKNNLCTATNKYCS IAGNVKFTAI EASAGKAI S 
FYpiiiVNVSTKETNAQELKL 

LSWSFTQS PGTT ITMGPGSVLSNHSKEAGGIAINNVI IDFSEIVPTKDNATVAPPTLKL 
VSRTNADSKDKIDITGTVTLU)PNGNLYQNSYI£m 

GNtiGAKKGYLGTWNLDPNSSGSKI ILKWTFDKYLRWPYI PRDNHFY INS IWGAQNSLVTV 
KQG I; LGNMLNN ARF ED P AFNNFWAS A I G S F LRKEVS RNS DS FTYHG RGYT AAVDAK PRQ E j 
F I ISaAFSQVFGHAES EYHLDNY KHKGSGHSTQASLY AGNI FYFPAI RSRPI LFQGVATY/ 

gywqhdtttyyps i eeknmanwds i awlfdlrfsvdlkepqphstarltfyteaeytrih 
qekrreldydprsfsacsygnlaiptgfsvdgalawrei ilynkvsaaylpvilrnnpr* 
tytostkekgnvvnvlptrnaaraevssqiylgsywtlygtytidasmntlvr""^ 

rfCtf" 

CPn_0541 627137 628003

Solute binding protein (yebL-Synechocystis Adhesin Homolog)
NNR'lSYQTAFVMHKVIVFIFLTLYSLKSYGNDVIDKPHVLVSIAPYKFLVEQIABETCFV 
YA^NHYDPHTYELPPQQIKELRQGDLWFRIGEAFEKTCERNLTCC^WI^QNVSLICXS 
KPCCNQHTTNYOTHTWLSPKNLKVQVCTI-VT^ 

EI LTITSKAKQRH I LVSHGAFGYFCRDYNFSQHTI EKSSHVEPSPKDVARVFHDIEQYKI 
SSVILLEYSGRRSSAMIJUDRFHMHTVNLDPYAEhA^VNLKTIATTFSSL 

CPn_0542 628000 628737 

ABC Transporter ATPase 
FMT I R I LAEGLAFRYGSKG PNI I HDVS FSVYDGDF IG I IG PNGGGKSTLTML I LGLLTPT 
FGSLKT F PS HS AGKQTHSM IGWVPQH FSYDPC F P I SVKDWLSGRLSQuSWHGKYKKKDF 
EAVDHALDLVGLSDHHHHCFAHLSGGQIQRVLLARAIJVSYPEILILDDPTTNIDPDNQQR 
I LS I LKKLNRTCT II.MVTHDLHHTTNYFNKVFYMNKTLTSLADTSTL/IDQFCCHPYKNQE 
FSCSPH 

CPn_0543 628710 629603 

(Metal Transport Protein) 
KSG I FMLS S L I R DS FPLL I LLPTFLAALGAS VAGGVMGT Y I WHR I VS I SGS I SHA I LGG 
I G LTLW I QY K LH LS F F PMYG A I VGA I FLA LC IGK I H LKYQER EDSL I AM IWS VGMA IG 1 1 
F E S RLPTFNGEL I NFLFGN I LWVT PS DLYS LG I FDLLVLG IVvLCHTRFLALC FDERYTA 
LN HC S VQLWY FLLLVLT A IT I VML I YVMGT I LMLSMLVL PVNl ACR FSYKMTR I MF I S VL 
LN t LC5 FSG IC I AYCLDFPVGPT I SLLMGLGYTASLCVKKR/nPSTPS PVS PEINTNV 

CPn_0544 ti'iOS'18 629525

yhbZ-GTP binding protein
K:'-'.',VF f '.','<'> I KS^FFCLKKDKNV IMFVDO [TLELRAGKfjjCNtjWAWRKEKYLPKGGPYGGN 
i K IN< VY.'M 1 1 EATT.'3VN*:'FF.AYRN I R FLKAr'Df;0:3GATtiNRTCRnCKDL IVSVPTGTLLRD 
AKT< ! t.HDnVDCEIU,LV:;(^;r ) KrXlKclNTFFKTSVt)RAP'rKATPr;KP(lEIROVELELKL 
f AD If'UAA IFt'NAi IKHTLKNTI .AHTEVK\V.AYPFTT[/r^LGLV[ t CKDRLYQKPWr I ADI P 
• It lK f iAII'.'NKCL' :i.ni-'LRH FERTLLLLFV [DVSKRttPlJSPEEDLCTLIliELIL'JHQPDFEK 
KIMIA/ALriKlDlJl,LlW^?L-:ii<-^ 

i TnJt'M'. t.ifODK <> \l)f.\\J 

i \ I. t i \ x j:;oiim I pi lift: in 
■n-:AM-:i-'i'VMAiiKK':oi ;a::hm' ;rd:jkskklajvw&i":a< ;o«v:;Tt ; ; : i i.v^yufrrRwtiPAQNVG 
ki ;nbi/n .fai iVu ; i vvmkkthhty [:;wrtm 



1 ^^^n 

LMEPYAVIO 



4*3 im J 



CPn_0546

r rl -^RLTLSIEPFtRKKLMEPYAV 
FVFDGTKASLG3PTIANAQVKAEYI^HVKGEKW 

EILI 



CPn_0547 
yabB Family



^1 15S 



532188 



pf:."' 



fha; 7Ja:.::".tnk- si. vr/w'v.: -..?•-< ;:t::.:;c:y:.e 

L)RPKFLCKLJALR(JN lAyVMNLTfTDIGITATSGEGLJD 



EALKli LKPN&K I JHVAITit 
FCCGDGVQCFC/LTVME*fC^ 

CPn_0548 633234 632191

cysJ-Sulfite Reductase
KMYLOEKFKAQQVPLVVRELLSCSDS INDSDP I YRMVFDSNDTT I S YKVGDALGVL PENS 
KEVS EHVLQ LLG Y S PT7TLVNVKKTS EKVS AQK F I QG YVD LDK I P AKLNS FFPDKDPKITL 
YDAIQEYRPQI PIELFAESVFPLLPRFYS I ASSPDLHPKS I ELLVXHVSYPGKYQKRFGV 
CSSFI^SELOVNDSA^IFVQPTKHFTLSTCTEGKPLV^IGAGTGIAPYKAFLEERLFNKD 
PGNNlXFFGERiCENWFTyREFWNHAEEEGKLKLFIJ^SRERDQ^ 

kayeeggfffvcg/kv^ievkhaleeilgkdtlaslrkehryvvdvv' 

CPn_0549 633662 633255

rs10-S10 Ribosomal Protein

PODVQHQPVWOTSt^FIJtKFKKRIiRSKGCMKC^KQKIRIRLKGFDC^QLDRSTADIVE 

TAKRTGARVvSp I PLPTKREVYTVLRSPHVDKKSREQFE IRTHKRLVDI LDPTGKT I DAL 

KMLAL P AGVpI K I KAA 

CPn_0550 635688 633580

fusA-Elongation Factor G
LNYGENNIKFMSNQEFDLS A I RN IG IMAH I DAGICTTTTERI LFYAGRTHK IGEVHEGGATM 
DWMAQEOERG IT ITSAATTVFWLGAK INI IDTPGHVDFT I EVERSLRVLDGAVAVFDAVS 
GVEPQEETVWRQADKYGVPRIAFVNKMDRMGA^ 

OFVGM^LI SQKALYFIXDTLGAKV^EKE ISEDUCERCAEIJIANLLEEI^T I DESNEAFM 
roPDSITEDEIHQVMRKGVIENKINPVLCCTAFKNKGVQQLXNVIVKWLPSPLDRG 
NIRfilNUCTDQEISl^PRWXSPLAALAFKIMTDPYVGRITFIRIYSGTLKKGSAILNSTK 
DKBER I SRLLEMHANERTDRDEFTVGD IGACVGLKF SVTGDTLCDDNQE I VLERI EFPD P 
v t£ma I EPKSKGDREKLAQALS SLSEEDPTFRVSTNEETGQT 1 1 SGMG ELHLDI LRDRM I 
R£FKVEAWGKPQVSYK£TITVSGNSETKYVKQSGGRGOYAWCLEIEPNEPGKGNEWS 
^VGGVIPKEYIPAVIKGIEEGLNTGVIAGYGLVDVKVSIVFGSYHEVDSSEMAFKICGS 
AVKDACRKAKPVILEPIMKVAVITPEDHLCDVIGDLNRRRGKILGQESSRGMAQVNAEV 
/"pLSEMFGYTTSLRSLTSGRATSTMEPAFFAKVPQKIQEEIVKK 

CPn_0551 636174 635698 

rs7-S7 Ribosomal Protein 

^SRRHSAEKRDIPGDPIYCSVILEKFINKVMMHGKKSV^ 

VLEGFGEALENAKPILEWSRJ^VGGATYQVPVEVASERRNCLAMQWI I KHARS KPGKSME 
^VGLATELIDCFNKQGAT IKKREDTHRMAEANKAFAHYKW 

CPn_0552 636698 636219

rs12-S12 Ribosomal Protein

IQAGYVPSS S ENKPLPTKRALLY I SMLVWRLKREEYMPT I NQL I RKRKKS S LARKKSPA 
LQKCPQKRGVCLQVKTKTPKKPNSAUlKVAWVRLSNGQEVIAYIGGEGHriUJE^ 
GGRVKDLPGVRYH IVRGTLDCAAVKNRKQSRSRYGAKRPK 

CPn_0553 637753 636812

No robust homolog present in Genebank/EMBL as of 11/7/98 
GCMWRWLRFLI IFILGRAVFPLRASESFSWETSTCLTVXGI PFIDI ILTTNEDFVAQCG 
LOIGTISSTNNAKIKEIFLIYKEKFPEASISFKRKEPU^LSQSHLSDD3ILCMRNGETYA 
EGMANKENGPALKQPKDLRLVLRC PNQPDTLLYSEKE^EKGI ETNTCLCNQGYTLLDGQL 
ILYGDSIEKFLKETKRKNNHTLVDLCDSQVVTTFLGRFWSLLNWQVLFLSEDSAKILAG 
I PDLAQATQLLSHTVPLLF IYTNDS I H 1 1 EQGKESS FTYNQDLTEP I LGFLFGY INRGSM 
EYCFNCAQSSLGET 

CPn_0554 637806 638141 

CT440 hypothetical protein 

VFSYLLLCIILVYWFMYEGKSRMASPTPG^LHLQQKVESKAYDYSRSLAMIATALLFFI 
VALI LSGLSLLPQVFL PFSGAYF I IGSFLAF I ALG I LL INCVCDLKQYLT S S 

CPn_0555 638298 640241 

tsp-Tail-Specific Protease

MFVMKKLVRLCWLLSLLPNVLFSSDLLREEGIKKMMDKLIEYHVDAQEVSTDILSRSLS 
SYIQSFDPHKSYLSNOEVAVFLQSPETKKRLLKNYKAGNFAIYRNINQLIHESILRARQW 
RNEWVKNPKELVLEASSYQISKQPMQWSKSLDEVKQRORALLLSYLSLHLAGASSSRYEG 
KEEQIAALCLRQIENHENVYLGINDHGVAMDRDEEAYQFHIRVVKALAHSLDAHTAYFSK 
DEALAMR IQLEKGMCG IGWLKEDIDGVWRE 1 1 PGGPAAKSGDLQLGDI I YRVDG KDIE 
HLS FRGVLDCLRGGHGSTWLO I HRGESDHT I ALRR EK I LLEDRRVDVS YEPYGDGVIGK 
VT LHS F YEG EN0VSS EQDLRRA IQGLKEKNLLGLVLD I R ENTGGFLSOA I KVSGLFMTNG 
WWS R Y ADGTMKC YRTVS PKK FYDG P LA I LVS KSSASAAE I VACT LQDYGVALWGDEO 
TYCKGTrOHOTITGDASODDCFKVTVGKYYSPSGKSTOLOGVKSDILIPSLYAEDRLGER 
FLEHPLPADCCDNA/LHDPLTDLDTCTRPWFQKYYLPNLQKQETLWREMLPOLTKNSEQRL 
SENSNFQAFLSO I KSSEKTDLSYGSNDLQLEES IN I LKDM I LLQQCRK 

CPn_0556 640921 640325

crpA-15kDa Cysteine-Rich Protein

ENGMSS NLH PVGGTGTGAAAPESVLN I VEE I AASGSVTACLQAITS J FGMVNLL [CWAKT 
KF TOP r RESKLFQSRACQ I TLLVLG I LLWAGLACMF I FHSQLGANAFWL 1 1 PAAIGLIK 
LL\rrSLf;FDELACT5EKLMVFOKWAGVLEDOLDDCILNNnNKIFGMVKTEClNTf.R/VrTPVL 

ndcrgtpvl^flv:;kiarv 

CPn_0'i r .7 uA2H'/'i *A I I'M 

omcB-60kDa Cysteine-Rich OMP

El PM.;K L IRKWTVUM.T: :MA:7.CFA.'y I EAAVAE3LITK IVADAETK PAI'VL'MTAKKVR 
IA/RRNK0PVE0K:;R<;AFCDKKFY[<:EIY;PX:0('VFJUWE^CYGRLYflVKWM>:NVEIC0S 

v p ey atv ;: ; r y r I k 1 1 Ai i : k k o ; vnvy r tc>j i . p<-: EAEFVun dp ett rv: : t> l,vwk t diu . 
r ;a< ;dkck itvwkpi ,iu:a :t m-taatw :acpf.lr.';ytkccopa ic r kok :ft« :ai:[.hcfa/i : 

YK lEVVTn' f Af^'TVtJNFVFT/;Y:;ifArXX^F<VI,:;FNtXDMRr^ iDKKVF^VF 

<j i TNVAT*/TYt \;< :nKi::;,\Nvrp'y/r iEr- :v<jvn [ ;adw:;yvck pvey:: r.';v:;Ni* ifjlvli i 
uvviyiypLP.'rt :vTVta-:Aii;t;ia^^rfK'y-/wiUKFM:F^frrf J 0FKLWKAovi^.Ri , *i , NOVAV 
i^E:iN(y;TcrscA^r-rniwK(:LAATi!Mr'7t.i7rr)Dprcvr,i^wRicvTNRf;:^ 

LI[.KF::KEF J OFIA:;::* , ,rTK t ITF^fJMTVVKDAI.PKl^^K^JVEF.SVTt.KCtAh iDAHrjRAI 
l: ; : ; un .ts i 'V: ■ nr enthvy

hi i 10' 



lTn 0V,H hi nun *>4J0U 

SSaS 

VMC^P<:0--TECNr;Q.lPOVKGCTnPDGRCKQ 



.•.Nm-Vri.KIIKi'ltMiiM!'';. 
FFLARSVFNTCYNTNL 



645666 644098 



CPn_0560
qltX-C lutamy 1 - 1 RNA W^^f cptgdphVGTAYMALFNEIFAKRFKGKMIL 
RN S R FQGMKS LWS KDKR I ^^F^^^^^^Qp^vGGWG PYRQSERTK I YQGVVETLLK 
R I EDT DRTRS RQ DY E ENI ^^5^HS rgGYDR RYRYLS P EEV AS REAAGQ PYT I RLKVPL 

SGECVF EDYSKGRWF ^^^i^y^^j^2^gKRj(NPTS I FYYRDSGYVKEAFVNFLTL 

TPKHLLLYE*F^PP^LI^^ 

MGYSMEGDEEVYSLERIIETFNP^ 

gXS^k^pkki^tvdkfmqredfeeatfdl 



CPn_0561 
euo 
LMACI 



646407 645871 



CPn_0562 648051 646918

Sl5%KEVAGH IQRHFDN 

CPn_0563 650113 648293



KG! 
KLl 
SEPRFSD 



S^YSREHDDHVIQVASSINTSLVESDFSFVSYSSET^EHASS^RWa 



avglL^etglBfstetlnetqn^^ 
lgtlsglyiapplllfmvrkenrsk 

CPn_0565 655741 654533

PNIVKKIITtYVLTPILGVPDALPKEEOENLKtLSOAAFLYSAEOVAKRMREEKQDSIRIK 
FiFTDPTSPTriLYFdPHliflSTPHRVTPISIjfcFVGEQESYTFA 

rf,B9l> 

.,vv,-ai/u! r>nn mwi i)ir.tKVLTUfTF f ]TENFGr'.PKEEIOEIFNIFYTOLDKUL.fti J ri 
'1 km : fur r : K PK 10TK iNH^RMTA^F^P.LEUVlAVNYajKDELVRAFKKLHVD 



B ,« TFl vr lly^oLFplt:;kauifitatcgavgty 

VUOIKFKSKTWYnDLro^ 



^57805 /SR464 



TQSKEFIG 
CPn_056<? 

■ •■i.'.A I'J 

SfeTrS^ 

CPn_0569 658/98 659099 

PlsC-Glycerol-3-P 4^"^cKFFW/AFSLFY KLKVYGVKKNFIKGPAIIAV 
LFG FDNKTS SG ^^tyuJ^J p * c ? cm T PWLWKQWGC F PVRQDEGNS AAF K I AS RLFN 

CPn_0570 
argS-Arginyl t 
TKLPSSKHGMNRO 
HYQCNDAMKLAR 1 



659044 660789 
^K^^flLSVICSOAIAKAFPNLEWAPEITPS^FG 

CPn_0571 662179 660749

CPn_0572 662349 664616

wmam 

AASQVASTLGQWNQAATAGSOPSSRRSSPTSPRRK 

CPn_0573 665413 664691 

yebC family 
VEDMAGHSKWA 

E^I PNENI ERNLK^TS^U^«r^c.c. v i i ^ p^y^I EAg"aEDLDTEDEENFLVICAP 



CPn_0574 665978 665394



; ;,.h.H i,Ut . 



M h 57 Hi 7 
niylyt r.Hi:;re 



LRGYKKRVGGGYGRA 
CPn_0575 666524 665982

L 

^Si.e chaln'EE.. S 2 (natural UCA (r—Mft 

MQENLDKRLEALRTEI SLAARSL 

CPn 0576.1 pp75'j?J 

prfB- (natural UGA frame-shift)

MQENl.DKRLEAL.RTE ISLAAR:: L 

rpn 0577 U-^t o,;VH^ ( .,f,Hl'i'i 

E^ PMnQKNKN^AFMl^rW^ : A'DLAV [ V f JKf I ( 'M PRTE I VK KVWKY t K K I IH( "OOQK N K RN I L, 
POANU\KVFc;:*:*DnnMFLWrKAL:JKH I VK 

CH. £633S^" t!l ^ '•'•'"^ 

SwEKF^^^ 

rjnHDY:;i;YISRNTK('.KtTL'[l'F.KK::i<ft'jl<AI lAVM'Jf ILF^SP^KYDFNl.nVU Ml 1)1 . 
DPSLPGLLLSHNPOC
LARCYFVrKEGKQLYV 



LKL1 KNTPLTLLHWrTHVIPWrLNIVCUSDLFARQFI 
SRt^rGDFVLSCWSHOPOVTLSWPKFARKFFERU 
NRCLCXJLKRIRFCSPPEICYITCSYD 

CPn_0579 669310 669993

CPn_0580 669936 670793

?leSSsmpayglslhhvcysspynnfcce^svstsnk 

CPn_0581 671533 670745

GKEFFSYPSFDVLTEHCSQQKLL 

CPn_0582 671305 672177



CPn_0583 



672349 672717 



CPn_0584 672659 673798

SEFI 1 1 PELLAALPKERMS 

CPn_0585 675880 673865

EGi^DKMHAQAIKDCEAAORKCCDLESlXSPVREDAG^ 
EVES^EQEQFQG 

CPn_0586 675993 677183

IDDKSFRODLYYRLlA/IPLHLPPLRDRQDDILPIJ^FIJ^FCP^NNTPLKTLSPKApEL 
LLNYPWPGN I RELSNVLERWILENTSLLTEDMLALA 

CPn_0587 677378 678124 

yvyD Bs conserved hypothetical protein

SYCEL?ILSTLLKHHVTLGDKMRPHRKHVSSKSLJ^ 
ILEKSDHLPPMETIRVVWSHK^^ 

RTMANKHSNKRKDRTKHDLGLAAKEERIAIQEEQEDRLSNEWLPVEGLDAW^KTL^YV 
PSLKPGFCI 

CPn_0588 678033 678626

LL*FU4HKI^PKKLIIIEKELKT^^ 

W Spw LLL^CKALAVNQNALKQLLGYNY2 ICATC [AMAGKHSQVSAL^PKTQTLDFY 
PGLPEYSGLLNYFISLNL 

CPn_0589 678634 679395

1 .HHN^r.^LPKL.'riKIDELNAFEAIKOTYALLEAljriKM ^^^'^^/^ ^^^^^^ 

i n hi f'E - ; • ;n r f.f f aa t rv lk llqy eh i ldlt pacj lcka:; i ,py ac y/i .hk lc k khqh k 

OA t [ EKEEEy I LQA I IHAKQF^ELLAI AEFP T AIAEKI FYLFD:.'BOEEKK.';ERN.'jIjEDP 
YllEILPklKWIIPY 

iTmJJVM) i.HOLV. b7'.i i :i^ 

CMVl hypol her. w:.il pror.rtin 



^^^JjJ.„ r ., w n ; -vr'"W' 'lliFF/'FYTHRFDF.'KoYPDHE 
LFLYrXHNLGFAeR7L£^ 

YHNDLVCFSEVTLIFNVSSEnCTITFS 

CPn 0591 S8o/64 691020 

flfj M RCTA YCT A3 A7TO<Vl/h LLK P P. Y FT I L3 R ^^[^^^^95^^?!!!? ^F«r^^ 

; F - r ; J,^*;; • I^'^X "j/, :'^ ; ..v: : r. :<•: :;• ~ a": ak>: i' ;KLi' LrKA::vNL 
hsdildep^f^ 

CPn_0592 768113: 681461 

^fISsF^FiW^^^^ 

ESS™ 

CPn_0593 682494 681391

Jy^SSSS^^ 

TELKGSSWWLLBmDLKI>FP^ 

LACMSTQRVi 
LKLPAKQS 



^VQEDQDEEYVV'QDGDSLWLIAKRPGI PMDKI IQKNGLNHHRLFPGKV , 



682517 684958 



CPn_0594

Kp^S^ffi^ANLVIPKEFSFE^PTTA^PDICPF 



IkTLSNQDI EEEYCRLVALLNELLTDTKGTINS 
CPn_0595 684943 685926

SyS^^^c^ymSrbqdpsvmkitfi^giivsgqe^gsdcti 



AGGIyKHEWYYRGRSVSKAKFERLNAAG 



CPn_0596 



685930 686457 



CPn_0597 688215 686479 

AKLRS A I SFIQDKRLWI EKESEDLR ILINPFFSSF HV^DDAOTS REMNKYVPWWQL 
SHY^^I^^^ 

PAESAVLWPPAIILTMLLIAIALIGDGVRDALDPRLQDS 



CPn_0598 



689712 688219 



kgp^rywfreSgltlpi?f^ 

AFLVROLNEEDLnTKVEALKCWFQDHGGTEVFCYSSKOFWKTFF^ 
FGTl^DAHKTVISEVIKRLRCSLVM^^ 

FLILFSrPVFVAVPWI LDN FV 1 MKT I P FTT IPMPYSGLRSP ^ ^Y F ^ f H VT KN AAVS I V 

gfWcavsygaiawjsrlsrs^ 

T S LAS 3 LGT LLGG ALWET LFM I DG FGNF F YQ A I LN RDM NWLF 5 V LVC SALS L VG Y LLC 
DICYVLLDPRVQLEGRRI 

CPn_0599 691823 689682

oppA-Oligopeptide Binding Lipoprotein

SrW,hmykr^pkilkgiva^^^ 

VKQQOTSQA [ PAAFC.VMLAPKLVRDEAF ALL.FriDP3YPNLLSLDPYKC^TLPELL^TNFH 
PM^U LRT Af I VG K T EN L ;5 P F NC F LYWG FY DLC I PCLA:JPMVOKYEEFnrDLAVKIEEHLV 
fc'DCXDKEFH IYLRFMVFWRP [ LPKALPKHVOLDEVFORPHPVTAIIDIKFFYpAVMNPYy 
AWPAVALRSOYEPW^^^ 

PRFVYQYFANTIEK 1 1 EDEN IDTYRTOfl I WAQNFTMHWANNY [VMAYYI AC -MDDEK IVF 
:;RNl'UF , VDPLA\L.lPKRF\ATKErTP:XFQPKKT f IK tDI SYliPPNORDNI^ ^FMK:>.iAYN 
KOVAKf X lAVRETVf'ADUAYTY I > ;**:F::i .FFtf: :if jVRi :AMNMA t UKF.t U W -LB : ^ - YT 
r//|M-A':r.:lP:;YNKOIEGWIlY::r.EFVVAI'!^KKB 

YYVK:;vTAiiTrAn\YAi*ACKi-:['JiK(::;[J/;u»MAiJL::oAK J fyNKnALi^ 

F.RA[.WM.^lAMKK l '::ANW^FIIIIErO\r;K[[rmi.;;YKYlJI.K[a^ir < [.YnRl-MIKm 
KIJ.\';kHCSLLYKO\VKN[FVFT!»'TnLri-Wl'l-:n/tJ\ri , MVWIJ-:KKKlM't u.i- 

«:i'ti_'i^oo (.'•;: t i >. •.•»] 

No robust homolog present in Genebank/EMBL
HGYMKrKK^FOYSLCOAKRFCJNMLPNHFDPCLOPVNl
F3SLLKEETC3LNRAKQHLLYKrLRDFHTMQHLR: " 



LAYGEL I I LLSKYQQKT 
IPMSPCL 



CPn_0601 693092 692736 

CT493 hypothetical protein 

QFPRIHADDt[NSMDErTPNYPLLRQDSLWNRVRVSWRADLSVSSRYEIASAIAILGLLV 
AFC ASAAVS I I FT ANP LAQVF I DGCLALG LLP I PLV IGLL I IG I IVLLYG IYLFPQQRE 

,: lO>. ^"T .. 

\T I-* i hyfui.!i.:i io-ii i-t'-t.-rLM 
DSCFMKPLGFQENLEALCNKTSRQLLKYLIKQILFVCGASLLIALEFSFFLYFFLFSGKT 
VIPAFCLACFFLTLFVCLVTRLYLLSGKGDFFEDLASEYLQGAVPPNKRSONIVEEQSHL 
AAAATKLS INLQNQEYSLLS EIFKFL PKHDLI RKFSC FC FWKDYFLFRECLLQKA I EAY I 
KWQAI PVDLSAHVSLADAYVALSGLYADPRKYPEFDANYWI PSGRYSAEIQEKFFATAR 
RAIEEFQILNEYAPGNAWVHAQLAYSYHDUJMPMEEIQE^EIVLKLKPNDVETMSKLGIL 
YFQOGMNAKGLR I YEE I KKRDYKKSQKL I KFYGVEYKY 

CPn_0603 694136 695185 

hemZ-Ferrochelatase

WKIMRLIVU^QCLVSLFLJUCKVTVTTPAYLLANFGGPRHAKDLiQEFLIStXTDRDW 
LPRVLHRHLFTF IAKKRVPKVLPQYQSLQNWS P I YFOTETLAKTLSE ILRAPVI PFHRYL 
PSTHEKTLLALRTLHTRHVIG I PLFPHFTYSVTGS IVRFFMKHVPEI P I SWI PQFGSDSK 
FVSLITCH I RDFLQKLG ILEKECCFLFSVHGLPVRY ISQGDPYSKQCYESFSAITTNFKQ 
S ENF LCFQS KFG PGKWLS PSTAQLCQN I DTDK PNVI WP FGF I SDHLET LYE I ERDYLPL 
LRSRGYRALRIPAIYSSPLWVSTLVDIVKENSTWAEELIKSGKKHTGIR 

CPn_0604 695931 695196 

fliY-Glutamine Binding Protein 

CKKRQNSEAQLNVKIKFSWKVNFLICI^VGLIFFGCSRVKREVLVGRDATWFPKQFGIY 
TS DTNAFLNDLVS E INYKENLN INI VNQDWVHLFENllJDKKTQGAFTSVLPTLEMLEHYQ 
FSDP ILLTG PVLWAQDS PYQS I EDL KGRL I GVYKFDS SVLVAQN I PDAV I S LYQHVP I A 
LEALTSNCYDALLAPVIE^/TALIETAYKGRLKI ISKPL^ 
AGLVKTRRSGKYDAIKQRYRLP 

CPn_0605 696737 696150 

yhhF-Methylase 

LRKLC SSRGDVR I LAGKYKGKSLKTFSNPH IRPTSGLVKEAFFS ICRED I EGAAFLDLFA 
GMGAIGFEALSRGAASWFVDISIKAIQLIHTNSALLGEQLPWIFRQDAQSAIQRLIKQ 
KRSFDLIYIDPPYELCNCYVETLI^KIVSGNILNPEGTLFLENASDEEIACEGLTIJIRRR 
KLGKTYLAEYIVEKDP 

CPn_0606 697492 696707

CT388 hypothetical protein 

S 3 YSRRQLRFYTGS LQMH I YGLADUiLALGVP E3CTMEVT"GDPWIGYHC;K ICS EWQ A WH P 
ED^LPGDISWAMNLSEAHKDFAFIGDLPGTKYMIRGNHDYWSSASTSKILQALPPSLY 
YLN^FALLTPHIJWVGVRLWDSPTICVKK^F^ 

AHi^PKEVTEVIWHYPPISSEXn'PGPISEFLEAIXSRVSIXLFGHIHKVQRPIDGFGN 
I |GI HY I LV AAD YVN FVPQ EVM 

CPn_0607 698910 697573

glgC-Glucose-1-P Adenyltransferase

NRMQMI ENDFPEASNFES SHFYRDKVGVI ILCGGEGKRLS PLTNCRCKPTVSFGGRYKL 
lOf # I SHAI SAGFSKI FVIGQYLTYTLQQHLFKTYFYHG^ 

. QdT^iDAI RKNLLYFEDTEI EYFLILSGDQLYNMDFRS IVDTAI RTHVDMVLVAQPIPEKD 
AY^MG^^IDSEGKLIDFYEKPQEKEVIJCRFQLSSEDRJIIHKLTEDSGDFI^SMGIYLFR 
RDSLFS LLREEEGNDFGKH L IQAQMKRGQVQT LLYNGYWAD I GT I ESYYEAN I ALTQKPH 
AEKRGLNCYDDNGMIYSKNHHLPGAIITDSMISSSLLCEGCVINTSHVSRSVLGIRSKIG 
ENSWDQSI IMGNARYGSPSMPSLGIGKDCEIRKAI IDENCC IGNGVKLQNLKGY I KYDS 
PQKKLFVRDN I I I VPQGT H I PDNYI F 

CPn_0608 699690 699016

Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated?
V^YFVKNGRRLWRMMNYEDAKLRGQAVAILYQIGAIKFGKH ILASGEETPLYVDMRLV , 
I S-SPEVLQTVAT LI WRLR P S FNS SLLCGVPYTALTLATS I SLKYNI PMVLRRKELQNVDP / 
SDA^KVEGLFTPGQTCLVINDMVSSGKSIIETAVALEENGLVVREALVFLDRRKEACQPL 
GP^GIKVSSVFTVPTLIKALIAYGKLSSGDLTLANKISEILEIES 

CPn_0609 699672 699966

CT490 hypothetical protein 

QNTKNSLIRENMLIRLFLG I SLPKGFPLYLEPPLVLATFQGTQFVGTYSEATNPLY y 
NLNYHYTQELLYKAVPCNYKSIYREI PLI I FPEVLIGSTPTQSTE 

CPn_0610 701450 700029 

rho-Transcription Termination Factor 
R I F LRFKGS I MKEERS S E I LPRVKEf KKH AYVSMQ EKSCVG ECAWAS ES EEAtfSVTVTK 
[ AKLQRMG I EELNILARQYGVKNIGSLTKSQWFEI VKAKSERPDELLIGEGW.EVLPDG 
FGFLRS PTYNYLPS AED I YVSPAQ IRRFDLKKGDTI IGT IRSPKEKEKYFALtKVDKING 
STPDKAKERVLFENLTPLYPNQRIVMEMGKDHLAERVLDLTAPIGKGQRGLBVAPPRSGK 
TV I LQS I AH A I AVNNPD I VL I VLL I DERP EEVTDM I RQVRGEWASTFDEQP ERH I QVAE 
MV I EKARRLVEHGNDWI LLDS ITRLARAYNTVOPHSGK I LTGGVDASALMK PKRFFGAA 
RN I EGGGSLT I LATAL I DTGSRMDEVI FEEFKGTGNMELVLDRRLSDRR-DYPAI DL I KSG 
TRKEELLYHPSELERVYLFRQAIADLTTIDAMHLLUJRLKKTNSNAEFL/ISLKE 

CPn_0611 702133 701420

yacE-predicted phosphatase/kinase
RRNRRDAKTSEREDGISYDFIRSYSCEYLNWKKLGRMLKLLKVSITdbLSSGKTEACQVF 
OELGAYWS ADE 1 5HS FL I PHTR IGRRVI DLLGSDWVDGAFDAQMMKVFYNSVLLQG 
LEA I LH PEVCR 1 1 EEQYHQS IQDCNY PLFVAEVPLLYEI HYAKWFBSV I LVMANED I RRE 
RFMKKTGRSCEDFDORCSRFU^VEEKLAOADVW'EhJNGTKKELHglK I EEYFYALKCAL 

CPn_0612 704638 702022

polA-DNA Polymerase I
K(:iMTr;LLGTVERF»RREYAMKKLPVLDA^:FIFFtAYFALPEM^NHQGQATQAVFGFIRSL 
NKLIKEF3PEYM LlJVFCX il'NNKQSROA [YADYKSNROKKFEB I PPO I ALVKEYCSLIGLA 
YI.EKE::VEADI)V [AS I AKKAREFJ^KWVTADKDLLgLVNDHVVAWNPWAiyjGWG irJE 
V [ EP.Yf : I PI*;MI PDYLALW JD^DN t PCLPOTGPKKAMLLKQFGSVEGLLENLDAVKGL 

wOTML.'jEROETLKL:?KRLALLD:]N i pipvpiesltfpoh wdeeklihfyioqgfktlvp 

: ; KOTEAATV DV'J I I KPAK: :LTN t LJJLVlVh'D [ AFAVAYTff JNHLf^LKLEGLALTQGSGVF 
F [ ALEEECiTK [ LI' t IiKOI'E'IiREDI.TFYCIYNLKRDCMALLNAf I tV tRE UiYDLALAEHLTN 
< ;r;r;K r .' ; I'Q: ', I ,LVNI K ;FTf\TA! IR FAKEWt."N:"CLP top. l^eopeoyfgefvaylp [ IKDATL 
EEt t IPKMI ,NH [ L: ID I EMlM^EKVLF.'IMERAiIVrLDVRratA 1 1 .KAL.FETELAVLTEE I YDL3 



GRFFNIK3PKGLSDILYN^^J[DKAk£tPAEVLEALR:;FHPI -EKLLAFRTIEKLLS 
TYVKALPKQVDSHTQR IHplBPrGAi.T/',RU\CRDFNLON t P I R3ERG t LLRKAFRLSEK 
NSYFLSADY3QIELRFLAHL3QDKSLKFAFEjGEDIHAFTAJ0VFHVPLEQVSKEQRMQA 
KTVNFG rVYGQOAFCLAKVLK ISIGErfQEL ICAYFSRYPE I AHFVEET IO0AAKDLRVTT 
MLGRER I IDSWNEFPGSRAASGRFAV/ITR ICOSAAELI KLAMLDISQAIKCCQMKSRMLL 
O I HDELLFEVPEEEI EEMQRLVREKf(E5AMTL3VP I WN I L rCKTJWAEC 



705662



7(M * s 3 



CPn_0613

:.*hl> i : ' ■ 
WMHTLWH:" 

KTAPI IAVI EMKDVI A33KNTAKTIUN i LEGrEKAf LKDRVKG I VI E3MDCPGGEVFEXDR 
I YSMLRFWKERKGFP I Y rr^NrfLCASCGY YV3C AATK I Y ATSSS L IGS I GVRSGPFFNVK 
EGLW^YGVESDLLTAGKDKAPHNPYTPV/TSHDREERQATLDFLYGQFVTDIVTC^PLLTK 
"EKLVHTLGAR I FSPEKAKQEQsY I DWGATKEQVLQD I VAVCKI EDNYRVIGSGGDGWWKR 
VASAAASSPLVTGMIKHDI VPLSHDAAY I PPYLAL 

CPn_0614 707435 705783

adt-ADP/ATP Translocase

VFIRHKVGKEFMQSSEVKPFSRLJlfcYLCPIYKSEFSKFVPLFLJLAFF^ 
TLVIVGSDAGAEVIPFpCVWGIVTCAVIVTMVYG^SRYPRDTVFf 
AVIIYPVGDSLHLNSL^KLQELIjPC^LRGFIVMVIIYWSYSIYYVMSELWSSVVI^MLFW 
GLANQITTITEAGRFTALINTGLNLSSICAGEISYWMGKGTFVAYSFACDSWHSVMIJ^ 
ML ITCSGL IM IWLYEW IHHLT I UTS I PPS RRVLAEEGAATANLKEKKKPKAKARNLFLHL 
IQSRYLLGLAI IVLSYNLV I HLFEWWKDQVSQ I YS SHVEFNGYMS R ITTL IGWSVLAA 
VLLTGQC I RKWGWrVGALVT PLVMLVSGLLF FGT I FAAKRD I S I FGGVXiGMT P LALAAWT 
GGMQNVLSRGTKBTFFDQTKEMAF I PLS PEDKNHGKAAI DGWSR IGKSGGS L I YQGLLV 
IFSSVAASLNVI^LVLLI IMWWIAWAY IGKEYYSRAADAVATLKQPKEPSSS IVREAQ 
ESVEQEEMAVL/ 

CPn_0615 708149 707634

pgsA-Glycerol-3-P Phosphatidyltransferase
LAKIMRQFokxSLSRLWLALYFCGEKUnPX^ 

ILDP ITDiOPFVFVC ITVLYMEGSLS I AHLFF ICARDLFL 1 1 FVCYLS LVKGWKGYDYGS L 
FWGK I FTWQF 1 1 LLGVTAGGE I PWTGLVPLVALGFLYFLER I MD YKKQ F LR 

CPn_0616 708704 710137

dnaB-Replicative DNA Helicase

TLTNYESSLLMDKSTGVPLPSP PHSKES EMIVLGCMLTGVHYLNLAANQLYEEDFYYLEH 
KI I FRVLQDAFKQDKP I DVHLAG EELKRHNQ ITV IGG PS YLI TLAEFAGTAAYLEEYVD I 
IRSMS I LRKMISTAKEI EKRALEQPKNVAEALDEAQNSFFKI SQSTSVSQYTLVADKLRG 
LTTTTDKPYLVQLQERQELFLQNACGDNKSFFTG I PTHF I DLDQL I HGFS PSNLMILAAR 
PA]^KTALAIJJIAENLCFQ^^^PIGIFSLEMTV^LIHR^IICSRSEVDSKKISIGDLSGH 
DFmiVSVINEMQEHTI^IDTOPGUCVSDLRARARRMKESYDIQFL^ 
ATESRC/TEISEISRMIJCTLAREI^IPILCL5QL^RKVEDRAJfflRPMMSDLRESGSIEQD 
DLVMFLLRREYYDPNDKPGTAELI IAKNRHGSIGSVPLVFEKELARFRNYSAFECIS 

CPn_0617 710481 712316

gidA-FAD-dependent oxidoreductase
LMWTHP IAYDVI\A^GAGHAGCEAAYCSAKMGVSVLMLTSNLDTI AKLSCNPAVGG IGKGH 
IVREIDALGGIMAEVTIXJSGIQFRILNC^KGPAVRAPRAQVDKQLYHIHMKRIiLENT 
HIMQATVESUJDKEGVISC^TTKEGWMFSGKTVVLSSGTFTIRGLIH 

GI^EDLKKRGFPISRLKTGTPPRLIASSINFSCMEEQPGDLGVGFVHRTEPFQPPLP 
^SCFITHTMEKTKAI ISANLHRSALYGGCIEGVGPRYCPS IEDKIVKFSDKERHHVFLE 
^EGIJiTQEIYANGLSTSMPFDVQYDMIRSVLGLENAIITRPAYAIEYDYIHGNVIHPTLE 
^ SKLIEGLFLCGQ INGTTGYEEAAAQGLIAG INAVNKVFNRPPFI PSRQESYIGVMLDDLT 
TQII^EPYTWFTGRAEHRLLLRQDNACARLSHYGYEIXSLLSEERYELVKKQNQLLEEEKV 
RLQKT F RQYGQS WS LAKALS R P EVS YDMLREAF PND I RDLG A VLNAS LEME I KYSGY I D 
RQKILIQSLEKAESLLIPEDLDYKQITALSLEAQEKLAKFTPRTLGSASRISGIASADIQ 
VLMIALKKHAHH 

CPn_0618 712300 713010 

lplA-Lipoate-Protein Ligase A 

KNMPTTNC I FLDLRGHS ILHQLQI EEALLRVANQNFC 1 1 NSGAKDS IVLGIS RNLNQDVH 
ISRAQADHIPIIRRYSGGGWFIDSNTLMVSWIMNSSEASAQPQELLAWTYGIYSPLLPN 
TFSIRENDYVLGHKKIGGNAQYIQRHRWV^HTTFLWDIDLDKLSYYLPIPQQQPTYRNQR 
SHEEFLTTLRPWFPSRDDFLERIKASGSLLFTWEEFLDNELEEIUVQPHRKATTVLN 

CPn_0619 713462 713013 

ndk-Nucleoside-2-P Kinase 

RRYVYTMEQTLS I IKPDSVSKAH IGEILSI FEQSGLRI AAMKMMHLSQTEAEGFYFVHRE 
R P FFQE LVDFMVSG PVWL VLEG AN AVS RNR ELMG ATNP AEAASGT I RAKFG ES IGVNAV 
HGSDTLENAAVEI AYFFSKI EWNASKPLV 

CPn_0620 714145 713519

ruvA-Holliday Junction Helicase 

DKMYDY I RGTLT YVHTGA I V I ECOG IG YH I A I TERWA I EC I RALHODFLVFTHV I FRETE 
HLLYGFHSREERECFRILI5FSGIGPKLALAILNALPLKVLCSWRSEDIRALASVSGIG 
KKTAEKLMVELKQKLPDLLPLDSRVET3QTHTTSSCLEEGI0ALAALGYSKIAAERMIAE 
AIKDLPEGSSLTDILPIALKKNFSGVNKD 

CPn_0621 714707 714U4 

ruvC-Crossover Junction Endonuclease

L3RLGSSFKDNKFKVF0ES tVSELI IGVDPCT IVAGYAI IAVEQRYQLRPYSYGAIRLSS 
DMPLPMRYKTLFEQLSGVI.DDTQPNAMVLETCFVNKNPQ3TMKLAMARGIVLLAAAQRDI 
L T FEYAPNVAKKAWGKCHA3KR0VQVNT/GK r LNVPEVLHPSNEDIADAFALAICHTHVA 

RSPLCGVR 

CPn_0ft22 7 1S761 714VM 

CT'jO'i hypothntic.il protein 

F^Y:JVPfX:JrLKLHLF^;LR:^'^S^L:;r>^IYYHf'^:::R3MLHLLCRWKDADIMEWQQ[CNILSGV 
^::PMrx;KLV:U^KL^D:^MrOEHERIMWYREOL::ALEEEYRRREEAKWODLEKLOOENT 
ViUJlk f .AEKI.00 1 RHO:!n I I PE r KKF.LL/jr: VORTE I :;E^RP.LCYF.i I K I KOLEEQLORYVS 
OH' lAl'i; t [■: [ EEOK^: 1AAYAK t NPLKK. f JL [ DLOOEKD I Y t KTYHSE I AKLHKKLQRQEGAQ 

•I , :::;f•:v^':;rEKLT^:vfY^Dt J ^l■:KKKAtALLOLHV[■:t>JY(:o^P.or 1 HKE^^ 
* d.u ;kkp i-:: ; i ; :v ow F: : k: : k ; : i ; ; : ; 

*:ii\j)f,:\ i '■' i /in i /!».h,( 

' T'.'M hypor h»:t. ic.i I pi . ,t .,■ \ t) 

I K f ;Yf JYVYf-TRDf'V TF.TV I y ;YKl.:;7ltNTKHF:;ODI'KMVFj\ C KVI.'IL/JN H .'FFRNtTD 

n:;K[i'LV['Ai;r.)Yr:vMi':v r Ri tk; i mi.kav ;i,Di\t ;vk t A f )f iPE/^l i kltk.itplpv i dekpla 
F.ELWEE: HJENE I VEQK KF5 LLPPPAKL t.1 EV I :
[NALL.'JADUArHFPETEEEPT3AoFEE53AMFFPET! 



F-PETS^^H 



'LNES LQALVR ES5DL 



CPn_0624 718018 717011

gapA-Glyceraldehyde-3-P Dehydrogenase

AMKWI NGFCR IGRLVLRO I LKRNSSVEVLAirfDLVPGDALTYLFKFDSTHGRFPEDVRC 
EADHl I VCKRK IOFLSERWVQNLPWKDLGVDLVI ECTGLFTKKEDAEKH IQACAKRVLIS 
Ar^KCnirTFVMHVNHKTFNPF.KDP/iriNASC^^ 

.\ :■ yivi .wi . :i . Tr I ww v ;c » :ujUi i \ *A:ri> ; a a k a vt u .* L p e L h v ; k LT r ;ma f r v p [ k.ov 

I ALNDR FFK LVAWYDNETGYATR I VDLLEYVEKNS K 

CPn_0625 718488 718060 

rl17-L17 Ribosomal Protein

VMQHARKKFRVGRTSSH^CMLANMLKSLIHYERIETTLPKAXELRRHADKMITLAKKNS 

UW^RIAIGRLMVRYNKLTSKEARQAKGGDTSVYNVDRLWNKLFDELG^ 

R I LKLQNR IGDNAQKC I I EFLAS 

CPn_0626 719670 718495 

rpoA-RNA Polymerase Alpha

^PAKKKAQSWLGKEKGMSDNAHNLLYDKFELPE^VKMLPVEGLPIDKHARFIAEPLER 
GMGHTLGNALRRALLIGLEAPAIISFAMTGVLHEYMAIEGVIEDVTNIILNLKGALLKKY 
PMQDSSLGRTTQVLKASISIDASDIJkAANGOKEVTI^DLLQEGDFEAVNPE^VIFTVTOP 
rQLEVVLRIAFGRGYTPSERrVLEDKGVYEIVLDAAFSPVTLVNYFVEDTRVGQDTDFDR 
LVLIVF/TDGRVTPKEALAFSTOILTKHFSIFENMDEKKIVFEEAISIEKENKDDILHKLI 
U3INEIELSVRSTNCLSNANIETIGELVIMPEPRLLQFRNFGKKSLCEIKNKLKEMKLEL 
GMDLTQ FGVG LDNVKEKMKWY A£K I RAKNT KG 

CPn_0627 720059 719640 

rs11-S11 Ribosomal Protein

FL I RSRVLVKNQ AQAKKSVKRKQLKN I PSGWHVKATFNNT I VS ITDPAGNV I SWASAGK 
VGYSGS RKS SAFMTVAAQDAAKTAMNSGLKEVEVCLKGTGAGRESAVRALI SAGLWSV 
IRDETPVPHNGCRPRKRRRV 

CPn_0628 720461 720063 

rs13-S13 Ribosomal Protein

DAYT ILREAQRMPR I IGIDI PAKKKLK ISLTYIYGIGSARSDEI IKKLKLDPEARASELT 

EEEVGRIi^SLI^SEYTVTCDLRRRVQSDIKRLIAIHSYRGQRH 

RKGKRKTVAGKKK 

CPn_0629 721881 720487

secY-Translocase

Kl gFRPYMTTLRQFFL ITELRQKLFYTFALLTACRVGVF I PVPGINGELAVAYFKQLLG 
SG^FQLAJDIFSGGAFAQMWIAI^VVPYISASIIVQLFLVFMPAi^RE>{RESSDQGKR 
R IGRLTRLFTVALAVIQSLLFAKFALRMNLTI PG IVLPTLLS SKLFGVPWI FY ITTVWM 
TTGTLLLMWIGEQISDKGIGNGISLIIALGILSSFPSVWSIVNKL.NLGSODSSDLGLIS 
I & £SAXiVFVFVL ITT I L 1 1 EGVRXI PVQYARRVIGRREVPGGGSYL PLKVNY AGVI PVT F 
AS^I^IFPATIGQFIASESSWMKRIAALLAPGSLVYSICYVLLIIFFTYFWTATQFHPEQ 
lAS2MKKWAFIPGIRC£KPTQHYLFn^^ 
SYiELGGTAMLIWGVVLCTMKQVDAFLLMRRYDSVLKTDRTKGRH 


CPn_0630 722316 721885

rl15-L15 Ribosomal Protein

mxkSesltoiserkrrkklijgrgpssghgktsgrghkgdgsrsgykrrfgyegggvplyr 
. rvptrgfshkrfdkcveeittgrij^lfqegeaitldalkakkaiaroavrvkvilkgdl 
EXT fvwqdtawlsqgvqnllg it 

CPn_0631 722812 722312

rs5-S5 Ribosomal Protein

EfeLSKNSHKEDOLEEKVLVVmCSKVVKGGRKFSFSALILVGDGKGRLGYGFA 
TD4|RKGGEAAKKNLMKI EALEDGS I PHEVLVHHDGAQLLLKPAKPGTG I VAGSR IRLI L 
EMAG I K D I V AKS FGSNNPMNQVKAAFKALTGLS PRKDLLRRGAAI ND 

CPn_0632 723354 722827

rl18-L18 Ribosomal Protein
KGqi^SWLVNLLQVFAPWLLNLIKWEF'V^ 
KSi^KRRRALRVRKVLKG S PTK P RLS WKTNKH I YVQL I DDS IGKTLASVS3T LS KLNKSQ 
GLTKJCNQEVAKVLGTQIAELGKiJLQLDRV^ 

CPn_0633 723760 723209 

rl6-L6 Ribosomal Protein 
SMSRKAREP I LLPOGVEVS IQDDKI IVKGPKGSLTQKSVKEVEITLK^NS I FVHAAPHW 
DRPSCMQCLYWALI SNMVQGVHLGFEKRLF^IGVGFRASVQGAFLDLS IGVSHPTKI PI P 
STLQVSVEKNTL I SVKGLDKQLVGEFAAS IRAKRPPEPYKGKG I RYENEYVRRKAGKAAK 
TGKK 

CPn_0634 724215 723787 

rs8-S8 Ribosomal Protein 
ESS IKRKRI YMGMTSDS I ADLLTR I RNALMAEHLYVDVEHs/mREA I VK I LKHKGFVAH Y 

lvkeenrkramrvfloysddrkpvihqlkrvskpsrrvyv/aaki pyvfgnmg isvlsts 
covmecslarskniggellclvw 

CPn_0635 724763 724206 

rl5-L5 Ribosomal Protein 
gerkanmsrlkkfyteeirkslfekfgyankmoipv/kkivlsmglaeaakdknlfqahl 
eeltm i 3cqk p lvtkarns i agfklr egqg ig akvtlrg i rmydfmdrfcn i vs pri rdf 
rg fsnkgdgrcc ys vc lddqq i f pe i nldrvkrtqc ln i twvtt aqtddecttllelmc l 

RFKKAO 

CPnJhiV, ■ 725 LOO 72475/ 

rl24-L24 Ribosomal Protein
FK L'.K EVMK KON I R VG DKVF I LAGNDKCKEGlA/LS LT EDKVW EGVMVR I KN I KRSQQNPK 
';KR t:;i EAP [fit :irWR LT I Af lEPAKLIJVKyTEQCRELWQRRF'I^yrSOLYRLVRGKKG 

'.T'nJK". (7 7L'M7i V/l'jO'J') 

rl14-L14 Ribosomal Protein
I K I'M IuuK-'IOI >K VA DNTt * A K K V Kt * F K /lGO C R R R Y AT VG I ) V I" V( ,\' ^ V R DV E PN : U K KG D V 
I KAVl VRTHRH tTHKrXII'TI.KFDTN^frvr IDDKGNPKTrp. t FGPVARF. [-RDRGFrKISSL 

AI'I-A/t 



rs17-S17 Ribosomal Protein

NK K EKVKSMA5E PPGGR KVK^^S AKMEKTV^VR VER [ FSHPQY LKV/RS 5 KKYYAHT 
ELKVSECDKVK IQETRPLSKLKRWRVIEW 

CPn_063? 725979 725/43 

rl29-L29 Ribosomal Protein

A3GKG I NMAAKKDLLTQLRGK5 DDDLDAY/vHENKKALFALRA EMLLQNKV/KVHMFSTHK 
KN t ARALTVKOEPKGKVHG 



rl16-L16 Ribosomal Protein

i imi^pkrtkfrkc^kgqfaglskoatfvdfgeyamotlergwvtsrqi eacrvainryl 
krrgkvwirifpdksvtkkpaetr/Igkgkgapdhwawrpgrilffa/anvskedaqdal 
rraaaklg i ktrfvkrverv 

CPn_0641 727092 726409

rs3-S3 Ribosomal Protein
KGRR IMGQKGCP IGFRTGVTKRWRSLWYGNKQEFGKFL I EDVR I RQFLRKKPSCQG AAGF 
VVRRMSGKIEVTICTARPG£VIGKJ<GA£VDLJ^^ 

V ADN I ARQ I ERR VS FR RARKKAMQ SVMD AG AVGVK I Q VS G RLAG AE I AR S EWYKNG R VP L 
HTLRADIDYATACA£TITGI IGIKWINLGENSSSTTPNNPAAPSAAA 

CPn_0642 727440 727096

rl22-L22 Ribosomal Protein 
RRHSMFKATARYIwQPRKARI*AAGLWRNLSVQEAEEQl^ 
AELHEN I KRENLSJvTEVRVDAG PVYKRSKSKSRGGRS P I LKRTSHLTVI VGEKER 

CPn_0643 727725 727450

rsl9-S19 Ribosomal Protein 

EIRIMGRSI^GPFVDHHLUCKVRAMNIEEKKTPIKTWSRRSMITPEWIGHT 
FLTVWSETMVGHKLGEFSPTRI FKSHPVKKG 

CPn_0644 728594 727722

rl2-L2 Ribosomal Protein

F I RE INSMFKKFK PVT PCTRQLVL P AFDELTTRGELRGTKSKRSLR PNKKLS F FKXS SGG 
H I SCRilRGGGAKQLYRVVDFKRNKDG ITAKVVWEYD PNRS AYI ALLSYEDGEKR 
YILA^KG IQRGDWVSGEGSP FKPGCCMTLKS I PLGLSVHNI EMRPS SGGKLVRSAGLAA 
QVIAKS PGYVTLKMPSG EFRMLNEGCRAT IGEVSNADHNLRVDGKAGRRRWMGVRPTVRG 
TAJwIPVDHPHGGGEGRHNGYI PRTPWGKVTKGLKTRDKNKSNKWlVKDRRK 

CPn_0645 728933 728598

rl23-L23 Ribosomal Protein 
/'DMKDPYDVIKRHYVTEKAKMLEHLSAGTGEGKKKGSrcKJJPKFWIVSHDATKPLIAQAL 
EAIYVDK^A^CVKSV^^^INVKPQPARMFRGRRJCGKTSGFIC<AIVTFYC^HSVG 

CPn_0646 729636 728950 

rl4-L4 Ribosomal Protein 

NsEVSHSTKJCPFKQKGTGNARC^LASPOFRGaSIVFGPKPKJr^HVRINRXER^ 
JlAQK IG/TNKXTVVDDTVTVDALTAPKTOSAIJIFIJCDCNVBCRS I LF I DHLDHVEKNEKLR 
' LS LRNLTAVKGFVYG I N INGYDLAS AHN IV I SKKALQELVERLVS ETKD 

CPn_0647 730490 729657 

rl3-L3 Ribosomal Protein 

YLEYFSYCK^PPLITCPFIFLRENFLFFLENSISKILSRFVSLFLQEESKSLLLiMDKFM 
RS H I SVMGKKEGMI H I FDKDGSLVAC SVI RVEPNWTOI KTKESDG YFSLQ IGAEEMNAP 
AHT ITKRVS KPKLGHLRKAGGRVFRFLKEVRGSEEALNGVS LGDAFGLEVFEDVSSVDVR 
GISKX3KGFQGVMKJCFGFRGGPGSHGSGFHRHAGSIGMRSTPGRCFPGSKRPSHMGAENVT 
VKNLEV I KVDLEKKVLLVKGAI PGARGS I VIVKHSSRT 

CPn_0648 731636 730605 

CT529 hypothetical protein 

FFFKJCPCKEVKMATOAIRSAGSAASKJ^PVAKEPAAVSSFAQKGIYCICOFFTNPGNKL 

AKFVGATKSLDKCFKLSKAVSIXVVGSLEEAGCTGDALTSAJ^C^MLK^ 

LNGAVP S I VNSTQRCYQYT RQ A F ELG SKTKERKTPGEYS KMLLT RG DY L LAASREACT A V 

G ATTYS ATFGVLR PLML I NKLT AK PFLDKATVGNFGT AV AG I MT I NHMAGVAGAVGG I AL 

EQKX.FKJ^AKESLYNERCALEWC^SQLSGDVILSAERAIJlKEHVATLKJlhA/LTLLEKALEL 

WDGVKLIPLPITVACSAAISGALTAASAGIGLYSIWQKTKSGK 

CPn_0649 732672 731710 

fmt-Methionyl tRNA Formyltransferase

LNLKWYFGTPTFAATVLQDLLHHKIQITA\An , R\mKF<JKRSAQLIPSPVKTIALTHGLP 
LLQPSKASDPQF I EELRAFNADVF IWAYGAILRQIVLDI PRYGCYNLHAGLLPAYRGAA 
PIQRCI MECATESGNTVI RMDAGMDTGDMAN r TRVP IG P DMT SG ELADALASQG AEVL I K 
TLQOIESGOLQLVSODAALATIAPKLSKEEGQVPWDKPAKEAYAHIRGVTPAPGAWTLFS 
FS EKAPKRLMI RKASLLAEAGR YGAPGTVWTDRQELA I ACS EGA ICLH EVQVEGKGSTN 
SKSFLMGYPAKKLKIVFTLNN 

CPn_0650 733513 732665 

lpxA-Acyl-Carrier UDP-GlcNAc O-Acyltransferase

SRRNMAS IH PTAI I EPGAK IGKDWI EPYWI KATVTLCDNWVKSYAY I DGNTT IGKGT 

TIWPSAMIGNKPODLKYOGEKTYVTIGENCEIREFAI ITSSTFEGTTVS ICNNCLIMPWA 

HVAHNCT IGt^WXSNHAQLACHVOVGDYAI LGGMVGVHQFVRIGAHAMVGALSG I RRDV 

PPYT IGSGNPYOLAG I NKVGLQRRQVPFATRLALIKAFKK IYRADGCFFESLEETLEEYG 

D I P EVKN F I EFCOS PSKRC I ERS I DKQALE EESADK EC/h I ES 

CPn_0651 733975 733517

fabZ-Myristoyl-Acyl Carrier Dehydratase

MNQPC V I K LR EL LDLL PH R Y PF LLVDKVLSYD I EARS I T AQKNVT I NE PFFMCH F PNAP I 
MPGVL ILEALAOAAGVLICLVLEWDRNKR rALFLG IQKAKFRQAVR PGDVLTLOADF.SL I 
S.^Kr/lKAWAOARVnSQl.VTEAELSFALVDKESI 

CPrt_')f,s;'. 7\.\HH() 733990 

lpxC-Myristoyl GlcNAc Deacetylase

KRN:;i [Y(;D;JL:\;FYMf,ERTORTLKREVRYSGVGIHiyJKCSTLHL0PA0TNT^IVF0R0:: 
A:;r 'AH EhA/PALI .PHVY'PPf IR: :'ITL..'iRG.';AV I ATVEH LMAALRSNN I DNL I IQCrr.ZE I P I 

f.;Dr;:;::Nvi^vKLi LVAfi [ceoedkv^ t ar ltp pvyyqhod i flaafp^delk e.':ytlhypq 

:^':Tr'rrOYK:;LV INEErJFROt' I AF^?RTFALYriEUTFU'IEKGLIGGGCLL , >NAVVFKDC* II I 

:;f<' ;oi.RFADt:pvRHK [ luj.u ;ni;:;rv^;npFVAHvLAVGr;GHSSNrAFGKK ilealki. 
cutE-Apolipoprotein N-Acetyltransferase

f JEPVLR T FCR/t.WCL lAFAOPDLGGFVS I LGAACGY^^PfeLEPLKKPSLPLRTLFVS 

CFFW I FT I EG I HFSWMLSDQY IGKL I YLVWLTL IT I LSvtFSGFSCLLVAIVRQKRTAFL 
WSLPCVWVA r EMLRFYG I F3GMS FDY LGWPMTAS AYG RQ FGC FLGWAGQ S FA V I AVNMS F 
YC LLLKK PH AKMLWVLTLLLPYTFGA I HYEYLKHAFQQDKRALRVAWQPAH PP I R PKLK 
sp rwwEOLUjLvrrp iqqp idlli FPEVWPFGKHRCVY PYESCAHLLSSFAPLPEGKAF 
LSNSDCATALSQHFQCPVI IGLERWVKKENVLYWYNSAEVISHKGISVGYOKRILVPGGE 

v t r> n k r . r por . f py y a r rr. k r l pgr r r,rrr/ovnGL prtgiti cye etfgyp losyk 

:•.>.* ;ai:,\; ;;:n:vtu< :wvi \::;\- r . ; -k vi r f m f ;m Lf J r ! : : ■ ;mp* *vp.Af .'nr^r/TA/WOSi/jp. :lk 

; Ul ■Yiyi , HETKAl':;^VLLT:;Ll , Li- , N , i'KTLY(;YC;L-Yi , M i L 1 AFCA'/i; Y 1/J f j( ] F LG Y P. LLAK 
KEIR 

CPn_0654 737051 736503

vdlD/yciA-acyl-CoA Thioesterase 

KK I IDFLSWRYYRNQEYPIKILSVESTMLKKKPVSFSCIDGHIYKIFPNDLNANNTVFG 
GLLMSLLDRIJaWAERHTESVCVTAFVDAIJIFYAPAYMGENLICKAAVNRTlTOTSLEVG 
VKVWAEN I Y KQ ERRH I TS A YFT FVAVNEDNQ P I PVHQ I VPET PEEKRR YNEADRRRQARL 
ELK 

CPn_0655 737856 737101 

dnaQ-DNA Pol III Epsilon Chain

KEIMSLLKDTVFTCLDCEMTGLDVKKDRIIEIAAVRFTFDSVISSIEFLINPERVVSAES 
QRVHH I SNAMLRDQPK IAEVF PQ I KAFFKEGDY I VGHSVGFDLQVLAQEMER IGETFLSK 
YT 1 1 DTLRLAKEYGDS PNNSLESLAVHFNVPYDGNHRAMKDVEININI FKHLCKRFRTLE 
OLKQVLAKP I KMKYMPLGKHKGRCFS EI PLAYLQWASKMDFDSDLLFS I RHE IKHRQKGT 
GFSQVNNPFMEL 

CPn_0656 737842 738048 

No robust homolog present in Genebank/EMBL as of 11/7/98 

THNFLLLPLSLFDILLTVEGFLCLTLYFASVQRMPCEQKRVPGNLYYYYIAAHSSLCLSV 

CKDTMENKD 

CPn_0657 738476 738051 

yjeE (ATPase or Kinase) 

PMGRYRRVSHSSQETU^TEUK3VLVPGAVLLLFGDYGAGKTEFVRGIVSGYLGDTIAE 
EVAS PS FS I LHVYGNE PKRLCHYDLYR I DQKNQEY I FQD AEEDDVLC I EWADRLPKPRFC 

■ DTINIYITMQTNMEREIIIEKR 

CPn_0658 739180 738455 

CT538 hypothetical protein 

fqtWGMD ISGAVKQKIXQFLGKQKKPELIATYLFYI^QALSLRPVVFVRDKI IFKTPEDAV 
R:i^£QDKKIWRCTEIQISSEKPQV^IE^^^(RIYIC^ 

PCfJHEKC^VRIKRFLVSEDPDVIKEYAVPPKEPIIKTVFASAITGKI^HSLPPLLEDFI 
SS^POTLEEVQNCTKFQLESSFLSLLQDALVEDKIAAFIESLADDTAFHVYISQWVDT 

EEr^ 

CPn_0659 739482 739838

trxA-Thioredoxin 

■ LQfiifojRDSNSIFREGKLMVKIISSENFDSFIASGLVLVDFFA^ 

ELPHVT I G K IN I DENSKPAETY EVSSIPTLTL FKDGNEVARWGLKDKE FLTNL INKHA 

CPn_0660 740327 739860

si*d9-rRNA Methylase 

MRV^LHCPD I PQNTGN IGRTCVALGAEL I LVR PLGFSLADKFVKRAGMDYWDKLQLTVVD 
sMSALHWPEDQIFCLSTKGSASYTEFSLPSSGTYVFGSESKGLPKEILKKYYKNCLRI 
P(|QQDI RSLNLATSVG I VLYEWRQKTVALQKNPTV 

CPn_0661 741139 740327

mip-FKBP-type peptidyl-prolyl cis-trans isomerase
H3jR©LK I KDRRRKMNRRWNLVI^WALALSVASCDVRSKDKDKDC^SLVFYKDhnCDTNDI 
EL5pNQKtSRTFGHLIJ^QLRKSEDMFFDIA£VAKGLQA£LVCKSAPLTETEYEE3Ch^ / 
QK^WEKKSKENLSLAEKFIJCENSKNAGVVEVQ^ 

KGSF INGQVFSSS EGNNEP ILLPLGQT I PGFALGMI^GMKEGETRVLYIHPDLAYGTAGOK 
PP^LLIFEINLIQASADEVAAVPQEGNQGE 

CPn_0662 742938 741172

aspS-Aspartyl tRNA Synthetase
SKGGYMKYRTHRCNELTSNH IG ENVQLAGWVHRYRNHGGWF IDLRDRFG ITQIVCREDE 
QPELHQRLDAVRSEWVLSVRGKVC PRLAGMENPNLATGH I EVEVAS FEVLSKSQNIfiPFS I 
ADDH INVNEELRLEYR YLDMRRGDI I EKLLCRHQVMLACRNFMDAQGFTE IVTPyLGKST 
PEGARDYLVPSRIYPGKFYALPQSPQLFKQLLMVGGLDRYFQIATCFRDEDLRApRQPEF 

aq i d i emsfgdtqdll p 1 1 eqlvatl fatqg i e i pl plakmtyqeakds ygtdlfpdlrfd 
lklkdcrdyakrssfsifldqlahggtikgfcvpggatmsrkqldgytefvkhygamglv 
•wiknqegkvasniakfmdeevfhelfayfdakdqdillliaapesvanqsldhlrrliak 
erelysdnqynfwitdfplfsledgkivaehhpftapleediplletdpl/vrsssydl 

VLNGYEIASGSQRIHNPDLOSOIFTILKISPESIQEKFGFFIKALSFGTPaHLGIALGLD 

rlvmvltaaesireviafpktokasdlmmnapseimssqlkelsikvaf j 

CPn_0663 744220 742901

hisS-Histidyl tRNA Synthetase
ksnh ferrhhvtvtlpkgvfd i fpyladakqlwrhts lwhsveka i hiVcml YGFCE I RT 

PIFEKSEVFLHVCEESDWKKEWSFLDRKGRSMTLRPEGTAAVVRS^LEHGASHRSDNK 

fyyilpmfryercxjagryrqhhqfgveaigvrhplrdaevlallwd/ysrvglqhmqiql 
nflggsctrfrydkvlraylkesmgelsalsqqrfstnvlrildsitepedqeiirqappi 
ldyvsdedlkyfneildalrvleipyainprlvrgldyysdlvfbrtttfqevsyalggg 
grydclicafggaslpacgfgvgleraiotllaokriepqfphkZrlipmepdadqfcle 

W.'JQH LR R U \l PT EVDWSHK KVKGALKAASTEQVS FVCL IC ER EZI SOQLV I KNMSLRKEF 

k;tkeevf.orllyeiqntpl 

CPn_0664 744775 744557

No robust homolog present in Genebank/EMBL as of 11/7/98
lwrahamkkl [al [ flvp i kcotnkehdahatvlkaarakynlffvodvfpvhevi ep 

njpix'Lviiyecwv 

•■I'tiJli.i.S V.M'.t'JH V4*;i«^ 

<>\\\a: Hi'Xt)::pht':'.ph.iro Ttr.inuport 

KMm/wrKITiJI'PKH I KR [ EtXJEWKKKYKYWR IR [ vf.'MF UV( IFYYFTRKSFTFAMPTL 
I MAf IFDKAiJl r. I [ « 1 : JTLY F: I Y f > [ fl K FVfjGVM S DQ-ft PR Y FMA {GUM ITCLTN I FFGMSS 

: : i v i . I •* a i ,wwi ; i m w f g< ; w i > pc ah lltm wy a k c e/';tww: ; vw: ;ts h n i cgal I P I ltg f 
i i [;Y:;f ;wi« :amy vi i ; i lc u imglvl inrlrdtp<:/[/;lpp t ekykrdphhahhegksase 
^'iM:K[i:i{Ki.::'i'iu:[f,FTYvr;rNOwr.WFr J AAA:;FK/TY rvRMAVNDWSALFLi etkhyaavk 
ANFCVSLFEIGCLFGMLV;^^^K I SKGNRGPMNVLF'JLGLLFAI LCMWF3RSHNQWWV
DGTLLFVIG FFLYG PQMM IGovW ELS Hp KAAGT ASG FTGWF A Y FG AT F ACY P LG KVT DV 
WGWKGFFIALIJ^ASIAIXLFLFTWNATEKNTRSKA 

CPn_0666 746370 750107

dnaE-DNA Pol III Alpha 
GFFLTWI PLHCHSQYSVLDAMSS I »6fVAKGQEFG I PALALTOHGNLYGAVDFYKECTOK 
GIOPTtGCECYTAPGSPFDKKKEKR^RAAHHLrLLCKNEOGYPNLCTLTSLAFTEGFYYF 
ri? c t^KDi .lrvY-j E' ; l i ' ' ! .. " '. ■: ^A;.>\:rKA:.;.;j:i^wcL'i>T.:/:fTr/:uiK 

M:JliEJ I AGFKEEWLKvuVV. 1 ; !" -NTAV.'.i-L.VKNU: I IT7ATND I ii V I NANLWQAH 
EILI^JVQSGETVRIAKONTHIP(5PKRKV^'RSREYYFKSPAQMAELFKDrPEVISNTL£VA 
KFCDFTFDFSKKHYPIWPESWO'LNSYTEEDRYOASAVFLKOLAEEALPKKYSSEVLAH 
IAKKFPHRDPIDIVKERMDMEHAII IPKGMCDYLLIVWDI I HWAKANG I PVGPGRGSGAG 
SVtXFLLGITEIEPIRFDLF/ERFINPERLSYPDIDIDICMAGRERVI^AIERHGKDNV 
AQIITFGTMKAKMAVKDVGBTLO^SKVNHIAiCHIPDLNTTt^K^ 
AESAOVIDMALCLEX3S IRMTGVHAAGVI ICGDQLTNHI P ICI SKDSTMITTOYSMKPVES 
VGKIJCVTJLLGLJCTLTSINfAMSAIEKKTGQSLAMATLPLDDATT I FQME 

SKGMQEIJUCNUIPDLFEEIIAMGALYRPGPMDMIPSFINRKHGKEIIEYDHPLMESILKE 
TYG IMVYQEG^Q I AGALASYSLG EG DVLRRAMGKKDFQGWEQEREKFC KRACNNG I DPE 
LAWIFDKMEKTAAYGFNKSHAAAYGLITYTTAYIJCAhrf PKEWLAAL^ 
LIREAQSMGIPILPPHINVSSNHFVATDEGIRFAMGAIKGIGRGLIESIVEERDHHGPYE 
S I RDF IORS DLKKVSKKS I ES L I DAGC FDC FDSNRDLLLASVEPL Y EAI AKDKK EAASGV 
^f^FFTLGAMDRKNE^ICLPKDIPTRSKJ<ELLKKEKELI/;IYLTEHPMDTVRDHLSRI^V 
VLAGEFENLPHGSWRTVF 1 1 DKVTTK I S S KAQKKF AVLRVS DG I DSYELP I WPOMYEEQ 
QELLEEDRL I YALLVLDKRSDS LR I SCRWMKDLS I VNEN 1 1 Y ECDQAFDR I KNQVQKMS F 
TMSTSGKETKAKp^PNENGHTQAIAPVTLSLDLNEUlHSHLC ILKK IVQKH PGSRTLVL 
VFTQDNERVAS^S PDDAYFVCEDI EELRQELVTADLPVRVITV 

CPn_0667 751097 750177

No robust homolog present in Genebank/EMBL as of 11/7/98
NISIXCKIOKRYFMKKLILYTAAFVASLFCGVFLWDRVPCAQKIbflU^AADHSSEVFSKSC 
RFVRKISG/EELCWERHVSPEaAIJUJPEYiro^ 
HIISQEGEILWSLVNGEMVIiiT<jIWrCSKGF1lECLLLHAGKQDM^ 
SLAQALAlKNIRJ^VIKECQKKKLIFASGNQIGTHFWFQPIRGCTTT 
RHAAVFpAQYS EDRVRHLVKMI FGDNFLIVRS SMVYVFVYK I SLVSADNSVRVEYINAVT 
GKSPQP 

CPn_0668 751176 752162

CTSfl hypothetical protein 
wrf/wsprximkfi^yvpi^vlvstgcdakpv^ 

frkallcfgi ithhfprdilrnqaoyligvcyftqdhpdladkafasylol 
pi?a£yse£lfqmkyaiaorfag^krxi%icrlegfpkl^lnadeoalriydeiltafpskdl 
aqalyskaaixratojdlteatktijckltlqfplhii^seafvrls£iylqqakk£phnl 

6YL^FAi<lJ^EEAMKKQHPNHPLNEWSAhrVGAMREK^ 
/YRTAITNYPDTLLVAKCQKRLDRI SKHTS 

CPn_0669 752140 752775 

CT548 hypothetical protein 

I EYLS I LPK I EI NMRLFSLGT I YLF FSLALSSCCG YS I LNS P YHLS S LG KSLLQER I F I A 
P I KEDPHGOLCSALTYELSKRS FAI SGRS SCAGYTLKVELLNG I DKNIG FTYAPNKLGDK 
^Ri^FIVSNEGRLSI^AKVQLIN^^7rQEVLIDC<VARESVDFDFEPDLGTA^IAi^EFAL^ 
F^MHSEAIKSARRILSIRLAETIAQQVYYDLF 

CPn_0670 752738 753196

rsbW-sigma regulatory factor-histidine kinase
PRRLLNRYTMTF FEGETVF PAVLSELH SMLDL I KRAGKQS KC POEKLLKLELACEELLVN 
IISYAYQGENSPGTIAISCISHRGDLEWIKDHGPSFNPLAVSINIQEDLPLEORKLGGL 
GIFLAKSSVDEFLYAREDHCNIVHLKMLNGQHS 

CPn_0671 753660 753205 

CT550 hypothetical protein 

RITINORKYTMSLDFFEEFYHOSIUnCTSFPEGYLNIAEILSYPHCTDANTDFLCSQSD 
NDFII AES KDKLTL FNADF A I WLVP EL VQGQ AVT RGYIAVSQGEGNYEP EMAF EASGQ YN 
QSSLILEALQLYLKDIKDTENALRSFRFNNDH 

CPn_0672 753723 755048 

dacF(pbp5)-D-Ala-D-Ala Carboxypeptidase

T IKSPHMKRPFFTYLC 1 1 FYGSCASLSLHAGLSFPEVRGATAAWHADSGKVFYDKDI DA 
V I YP ASOTK I AT AL F I LKH Y PTVLDTL I KVKQDA I AS I T PQ AKKQSGYR S P PHWL ETDG S 
TIQLHLREELLGWDLFHALLVCSANIJAAhA^^CCGSVEKFMDKLNFFLKEEIGCTHTH 
FNN PHG LHH PNH YTTT RDL I S I MRC ALK E P P FRGV I STTS YK IG ATNLHG ER I LS PTNKL 
LL PGSTYHYPPALGGKTGTTKTAGKNL I MAAEKNNRLLVT I ATGYSGPVSDLYQDV I ALC 
ETVFNEPLLRKELVPPSDCLQLEIANLGKLSCPLPEGLYYDFYASEDREPLSVSFIAHAD 
AFPIEC^DLLGHWVFYDDEGKKISSQPFYAPCRFERTIKPWKLYMKRVFTSYRTYMSITM 
LLMYFRI RKHRKYKNLKHYSK I 

CPn_0673 755242 755463 

CT552 hypothetical protein 

GKSTEGKAYHC FLKQVS I ALNREEVWDNPHHLMF I LMQFQQFSGEODRFGS FLEAT I RDR 
VSFLVLQEK I ATLK 

CPn_0674 756639 755577

fmu-RNA Methyltransferase

RG I LYVTMV PFRQHH A YQ L LKQ LHTS A I S EADRVS Y YFKQNR S LGS KDRQW I QN 1 1 FN I L 
RHRRLLETL I LDSGEQVT PEALVAKVNEGVLENLDS YSA I PWPVRYS ISDDLAHFLVQDY 
GEEQAEEIAKIWLTEAPITIRVTJTDKISVKELQEKLEYPSSPGELPEALHFSKRHPLQST 
EAFRRG FFE IQDENSOR I SW I "LTDKDI VLDFCAGAGGK3L I FAQKAKHW TNDSRKA I 
U3TAKHRLLRACARNFSLAD0LP.LGSFr:WIVDAFCSGTGVFRRHPEHKW0FSKKLLLNY 
'/RVrjKSILKOASAYVGPRGRLWITC:;LLKEENEAHVAYMHSUIWKEVHRKTLPLOVGKG 
DAFFTSHFCK I 

CFn_067 5 757->jL V r ,<.7nfl 

rrriW, hypothetical pror^in 

VPL:JMrLDFOFS IGYYLHVLia./,ir<r/ r\'HU -VYhKKHLLLDAWl'VUlJl't.rrNYCrrilVfm 
P.(JV IHELF^WijAILIY^ Uii'KI.l.A t rFLt'LllliKKf-OTf ;WI.YRLFFr:;KYH [ KKA [VDKLCM 
FKf3L t LFE-^KR PVDK [ VtfAANKVF.IKr ;K:JNF: :: IWKDPn I E*/TV: ^FV'jTPI I PJVCRRLAA 

DAr;L0MaEALrrLLEt;iiTAYLr4/;tja J r i Noi-[f;r.KAOfMyTL.:u-:K::YV!J^t:iJOLFSL 

MAEDFOT II MS I I r.DiZ L:? EVLAt l.'JL I ' it W .TFI If iKTPAJIJMO^PAI A: : PKD: 1KI ^Lf'FL 
AEVLRKVIVEKKLHVSK.'MWITPEE'yiM I Y;' I kf/^NPAl^WDKM tTMI.dMHWI.I.DYDHDIG 
f A f .P. K AA E Y Y N PH P S FW RO F I , P L.'/KJ P V p 
CPn_0676 759220 758051

JMCTP^CNrcDRN^SDPLEEGAAEECDSDLEDRVSE^^W^I ET I ADTG I PEAT PSEG 
TNSDLNSDLVDRVEYEARGSLLTTMLARIRKAVSQIWMHVK^ 

LLi^TRLPKETAEPPYFYALETALASCRSFFFHVFLRLFTLLRRQHPEAPLDLCGTDPIS 
PEAAVAFALILRSCCKWATDAVQTC 

YSDKR r DGLANVRGITKI ITSPYLCAGCCVSVVDNLKTYDWR^ 
KCFNFALVMKOrt.Yr.VRQDRrKFI^DFLMMWSEEHASEVNYTJVVLAIL^/NLPILEEDYR 

. ;il ! 1 VK K ! ,r JY7 : ■ *'.'K K' .". ' K !■ i W KI' r'P 
CPn_0677 760410 759256

No robust homolog present in Genebank/EMBL as of 11/7/98
RIAMGINPSC^SPDDVWRGAQCDSSSTQGT^ 

^TVREFFLGKKSPDSSCGASGPAMQSPSGPTIRPTRPAPPPPTTGGANAKRPATHGKGR 

SSpotlwgwvia^ 

gsaldrvrenhpnempriwialarei^aav^shatsvrianagkw 
locmkvlsvgawantmtvligdlfe 

CPn_0678 761329 760682

No robust homolog present in Genebank/EMBL as of 11/7/98 
KIIMSVNPSGNSKNDLWITGAhDQHPDVKESGVTSANLGSHRVTASGGROGLLAKIKF^V 
TCFFSRWSFFRSGAPRGSQOPSAPSAITmSPLPGGDARATEG^R^I^Y^^ 
IPQVPX3GGAQRSSGSTTLKPTRPAPPPPKTGGTNAKRPATHGKGPAPQPPKTGGTNAKRA 
ATHGKGPAPQPPKG I LKQPGQSGTSGKKRVSWSDED 

CPn_0679 762936 761725 

pgk-Phosphoglycerate Kinase

GYMDKLTVQ DLS PEEKKVLVRVDFNVPMQDGK I LDD I R I RS AMPT INYLLKKHAAV I LMS 
Hl£RPKGCGFQEEYSI£PVVD\T-EGYLGH 

RFH IGEEH PEKDPTFAAELSSYGDFYVNDAFGTSHRKHASVYV^JAFPGRAAAGLLMEK 

EXEFLGRHLLTSPICRPFTAILGGAKISSKIGVIEALLtTOVDYXLLAGGM^ 

I/3NSLVEKSAU3LARN\/LKIAKSRWriVLPSD^ 

FDIGPRTTEEFIRX INQSATVFWNGPVGVYEVPPFDSGS IAIANALGNHPSAVTWGGGD 
AAAWALAGCSTKVSHVSTGGGASLEFLEQGFXPGTEVLSPSKS 

CPn_0680 764254 762971 

ygo4-Phosphate Permease

YSMLPL 1 1 FVLLCGFYTSWNIGANDVANAVGPSVGSGVLTLRQAWI AAI FEFFGALLLG 
DRVAGTIESSIVSVTNPMIASGDY>ffGMTAALIATGVWLQU^FFGWVSTTHSIVGAV 
gfS/lgkgti IYWNSVGI ILI SWILSPFMGGCVAYLIFSFIRRHIFYKNDPVLAMVRVA 
PF£AALVIMTI£TVMISGGVILKVSSTP^ 

PKKGSLTTRLKERGGNYGRKYLVVER I FAYLQ 1 1 VAC FMAFAHG SNDVANA I APVAGVLR 
QAWASYTSYTLIRIWAFGGIGLVIGLAIMGWRVIETVGCKITBLTPSRGFSVGMGSALT 
IA^SIU3LPISTTHVVVGAVIX3IGlARGIRAINIi^IIKDIVLSWFITLPAGALI^ILFF 

FAIgALFH 

CPn_0681 764258 765001

CT691 hypothetical protein

ngSrshksftrsfrqviiakkailmc/tlarlfgqspfaplqahlemwscveymlpifta 
lrdsryeellemaklvsdkeyqadcik>0 | lrnhlpaglfmpisragileiisiqdsiar)t 

AEWAILLTIRRU^FYPSMETLFFRFLEXNLEAFELTOT 

RliLWGRVAKS EHESDVLQRELMQ IFFSDDF 1 1 PEKEFYLWLQVIRRTAG ISDSSEKLAHR 
INWTLEEK 

CFn_0682 764912 765955 

dppa-ABC ATPase Dipeptide Transport 
TS : kGLHKNSLFRNN^PKRSCKRLMASNPII^ 

qtt&i igesgsgksvsahailrllpcppfsvsgqvnfqghnlltasrsiqkki igteistf 
i fqnpqaslnpvft i eqqfre i ihthlalta£vakek>ilyaleetgfhdprlclnlyph 
lsIj^qriciamallcspklliadefttaldvsvqyoiwllktlokktgmsllii™ 
mg'waetaddvlvlyagrlwecapavqmfhnpshpytrdllasrpswpc^lgsfnpr'^ 
qp|4ytafpsgcryhprcskilnrcsaeapeiypvreghkvrcwlydd 

CPn_0683 765936 766919

dppF-ABC ATPase Dipeptide Transport
GVG£wTTNFPQPLIQATSLTKHYYKRSFWFC£^ 

SG K STLALALAG LLPLTSGFLT FNGT P I K LHS KHG RHQLRSQ VRLVFQN PQ ASpJP RKT I 
LDSLGHSLLYHKLVPKEKVIATVREYLELVGLSEEYFYRYPHQLSGGG^RVJEARALLG 
VPQLI ICDEIVSALDLS IQAQI LNMLAELQKKLSLTYLF ISHDtAVVRSFCTEVFIMYKG 
QIVEKGNTKRIFSDPQHPYTRMLLNAQLPETPDQROSKPIFQEYHKDSEES^STGCYFYN 
RC PQKQEACKSEI I PNQGDAHHTYRCIH 

CPn_0684 768056 767181 

spoJ/parB -Chromosome Partitioning Protein 
EKSGDIVTEEISKDTriEVAIDDIRVSPFQPRRVFSNEELQELIASIMAVGLrHPPWRE 
ICTGDRVLYYELIAGERRWRAMQLAGATTIPVILKHVIAKTrAAEATLIENIORVNLNPI 
EHAEAFKRL I HVFGLTQDKVAYKVGKKRSTVANYLRLLALSKT IOESLLQGQ ITLGHAKV 
ILTLEDPILREKLNEII I0EHLAVREAELIAKQLISEEGSSIELK9TPLDMAESSKQHEE 
LQQRLS DLCGYKVQ I KTRGSKATVSFHLQNTQDLQKLEAWLSS HG/TLS ES LS 

CPn_0685 768026 768217

No robust homolog present in Genebank/EMBL as of 11/7/98
FP<3SQYLL I FPNRILDLQAFEILDVQGMLTDQRKHIQMLHKH^S I EI FLSNMWEVKLFF 
KTLK 

CPn_0686 768373 768176

No robust homolog present in Genebank/EMBL as of 11/7/98
AKD. r 5MMPGGRLFRWCJELFFFSiVYVCEQRRPRKLYPSybHLNFPIEKPRFLLKGFKKEL 
HFYHHV 

CPn_0687 768501 760214

(T4H2 liypor.her.ic.il protein 
UK IHKNLRHAYRFSTPNCR^FMOKLVHN IWKKFYCFGGA I A [C [VLAJFLrJLK [Vl'NTYK 
MSQAK^N^tLLL'rRAAEVAV^FLPSKSALGSLEOAYHUJGESMKPYAGFLAiJCFYniN 
Kl'LW ;AVYA( ilAYmnOALQLPHr EaKLLKElSBROADOLYDVALSKJYOLLQTANSiSPE 
YPTr.:;FLTLU<V[ELKELUiUDV^0DFAALKS3/LFH0FERMY.'JDGEWTLCKRFGKKG 



-f Mr rvr^FROVFF^HSR^^^P^BlLKCNKAlT^ 

FYTN T F PFL E EQF I P A WG v*BP^ P GNAAQ DLMPSHRLKFSETLAF'JDEIF ; > NYM P FCC 

^I^E^smoLA^ 

3 D PTS RKLAADH Y PYS F LLGNT I N RK yKT HNIYRLDtK PMQ YVC PSLFOSSRY LKNW IKE 
KSKQLYLKKQLPKR 

CPn 0689 771407/ 770 H7 

yfhO-NifS-related Aminotransferase

^ £ \avb ; va:; :r-:! : :rv\<-r:-v~ V.ra.L«\i."vNr.i.v;:r-F--; :vv:,7:;fae 

HHAWLSWEIAC>RRCSLVKK$\^ 

Se^rydaySvtcac^phlpid^ 
lISlpp^^dw^ 

DKEIALT^K^LEIPG^I LGPS IEEPRGALIGMTIDGAHPLDLGFLLDLRGIAVRT 
GHCCAOPAMERWNVGHVLRJvSLGIYNDEDDIDQFILVLQDSLDKIRR 

CPn_0690 772704 771436

ABC Transporter Membrane Protein
LSVUtGDKVLVSIETF^IASGSPVQKAAEACYTQYSKOPSSKEVLSSFSWI^QELSLFPD 

RYNIATGASELIKQHWLHNNHSLAFECILI^KYEPSLS^ 

MQGFDVNKHPLAFLNAVCSEDRGWI YI PEEMQTSDP I FVRH ISFPTVSDHDVIFSPRIV 
VI LGQRAS AQI0 1 SHDVDLEMVGSSKT I VNGVTELFVGEGADLTVFMVPGY5 EEDTLSWS 
TIATVEKDAICRMTONLLESCCXSFC^FDNTSYIVGKKGHAESLVLVQ 
DAEETVSRQNIKS^LYSGHFLFECT I S ISSQGDLSDANQKHDTLLLSSEARVSTFPRLEI 
ETDEVKASHGATj^PLDPQQIFYMRSRGMTEAEAQEICLIHGFLKC/jLVSDTFl^SSFQLN 

CTS 

CPn_0691 773467 772685

CT691 hypothetical protein 
RGLGSMLK yKHLHASCNDVK I LDDFNLN I Q PGTMHV I MG PNG AGKSTLAK I LAGD ESVL V 
SSGEIALG£Q^LSMLP£ERSRAGLFVGFQMPPEIPOTNNKM^ 
SIDEFNTLLST^ETYEWATTDLFLDRNVNEGFSGGERKRNEICQMLVXEPE 

DSGUJvilALRLICRVLEKYRELHPTSS^ 
S LMH EL/EAKSYQ EVT KRVAWR 



CPn_0692 774945 773461



IQErcATGLKVMGESVKWLEEREDYPYGFVTP I ESQGLTRGLSEET I ^ j*?2£ 
IltFRWAYRYWKQUiEPAWARIitYGPIAYDDIVYFSSPKQKKPUSRI^DADPEIIJ^TFK 

G I PLDEOKRLLNVENV AVDLVFDSVS I GTT FKEALEKAGV IFCS1/3EAIQEH PNLVKK 
GSWSHRD^FAAimAVFSrcSFVYVPKGVKCP^ISTYFRINNKEAGOFERTLrW 

c^GGYASYLEGCTAPAYS SNQLHAAWELVAH EHAV I RYSTVONWYAGDKKTGKGG I YNF 
A^GIXAGYRSKISWSQVEVGAAIT^PSC^ 

'MLHVGKRTTSWISKGISSDESKNTFRSLVSWKKAEHSSNYTC^DSMLICKAS^ 
KIWENSTSSIEHEATTSKLREDQLLYLRSRGLSPEEAVSLVIHGFCREIIEQLPLEFAQ 

EASKLLLIKLENSVG 

CPn_0693 776292 775240

TPR - Repeats (O-Linked GlcNAc Transferase homolog) 

RfeTNHVLGEI SMEEAAKHLAKEFLCSG I NLFL.SG EY EQ AEKRLKETLELDST AALAYCY 
juG 1 1 ALETGRVS EALNWCSKGLASEPGDSYLRYCYGVALDRGNQYEAAI EQYSAYVALH P 
'^DcVeCWFSLGSVYHRLKRLQEALDC FDKI LALDPWN PQS LYNKAVI LS EMDDEAES I RLL 
E^AVAKNPLYWKAWVKLGFIXSRSKRWDKATEAYERVVQLRPDLSDGHYNI^ir^ 
TRLAIJCAFQEALFLNAEDADAHFYVGLAHLJ^LKQMREAYEAFNSALSINLEHERAHYL^ 
YLHHMCGETDKATKELLFLQKKDSTFAPLLQKTWS DPS SMQF ERRLOT I S 

CPn_0694 779635 776330

pbp2-PBP2-transglycolase/transpeptidase 

FSDESEAHNIHSMKRPKKFPIYLSIAQKTNRLLSGIVIAFAVIALRLWYLAVVEHEQKLE 
EAYKPQ I RVLPCYVERAT ICDRFGKTLAVNQLQYDVSVAYGA IRDLPTRAWRVDEHGHKQ 
L r PVRKHY I MCLSELLSQELHLDREA I EDAI H AKASVLG SVP Y LVAANVS ERTYLKLKML 
SKWPGLHVEAVVRRHYPQESVASDILGYVGPISLOEYKRVTQELSQU^ECVRAYEEGED 
P KL P EG LAS I DQ VRAL L ES VES NAY S LN AL VG KMGV EACWDSKLRGKIGKKPI LVDRRGN 
FIQEMEGAVPEAPGTKLQLTLSAELQAYADALLLEYEKTETFRSAJCSLKKREKLPPLFPW 
IKGGAIIAIJJPN^EILAMASSPRYR^FVNAKVAEDSKAVRSSIYRWLENKEHIAEIY 
DRKVPLIRERRNPLTGLCYEEILPLTFDCFLDFLFPENSVIKLQUCRNSFVGQAIEVQNL 
VTRLLSLFPYEEGTCPCSAIFDAVFPNEEGHILIQEVISLQEOKWIMECLNfQHKADIEEL 
KEALDQVFNELP.WDKILYTDILRLIVDPE^FSPVLPSEVHRLSLSEFTELCGRYWLR 
SAFST I LEDAF I EVHFKSWRKS EFLQYLAAKRQEEALRKQRYPTPYVDYLEEEKTRQYKM 
FCOEHLDTFLAYLFSKTPYKEGLEPYYDILDLWINELDNGAHRALSWHEHYLFLKERVSH 
LS EHLP ALFSTFREFNELQRPLLGKY PI S I VRNKRQTEQDLAAS FY PVYGYG YLR PHAYG 
OAATLGS IFKLVSAYSVLSQRI LWGHNEEPANPLVI IDKNSFGYRSSKPHVGFFKDGTP I 
PTFFRGGSLPGNrFMGRGFIDLVSALEMSSNPYFSLLVGEGLGDPEDLADAASLFGFGEK 
TGLGLPGEYAGRVTHDIAYNRSGLYATAIGQHTLWTPLQTAVM1ASLVNGGVVYVPKLL 
LGEWEGEHVSYLSSKKKRT I FMPDAWEVLKTGMRNVIWGQYGTARAICS0FPPOLLSR I 
TGKTSTAES IMRVGLDREYGTMKMKDIWFAAVGFSDQDLSLPT IWI WLRLGEFGRDAA 
PHAVKM I DMWEK I0QRESFLRG 

CPn_0695 780201 781382 

homologous to CT695 

SLEVSMKKLLKSALLSAAFAGSVGSLQALPVGNPSDPSLLIDGT IWEGAAGDPCDPCATW 
CDA I SLRAGPj-GPWFDR I LKVDAPKTFSMGAKPTGSAAANYTTAVDRPNPAYNKHLHDA 
EWFTNACF I ALN IWDRFDVFCTLGASNGY I RGNST A FNLVGL FGVKGTTVNANELPNVS L 
SNGWELYTDTSFSWSVGARGALWECGCATLGAEFOYAOSKPKVEELNVICNVSQFSVNK 
PKGYKCVAFPLPTDAGVATATGTKSATINYHEWQVGASLSYRLNSLVPYIGVOWSRATFD 
ADNIRI AQPKLPTAVLMLTAWN PS LLGNAT ALSTTDS FS DFMQIVSCQ I NKFKSRKACGV 
TVG AT LVD ADKWS LTA EAR L INERAAMVSGQFRF 

CPn_0h96 781703 782 V>') 

CTK 'J b hy pm her i o*j 1 p l r>r. i r i 

NfJCFWRMPLLTY^NFE t EVOiTLF^O: I' :KL,T T KDLMr;Ar;AIIFf II 10TRRWNPKMKLY I FEE 
KNr;LY[ rNLAKTLOW'U'NAl.l'HIRKVIODNKTVLFVnTKKOAKf.'VtRl'^AIEAGEFFrAE 
RWL//GMLTNMTriRN. r ; [FTLDK [ FKDf ,: ;^NOAYr J TKK['lAALL.AKHMOKLL.I<NLEt.'i I RYMK 

KAP':LLWvDr^YEKrAVAi*j\KKi/;[i'Vi.ALViyrNf:Di , '['i , [r)HVir , i*NUi). , :i.K::rRLi in 
virvm iFAKHKUi[Er/:'ivK:;ij-:vi'i)L,;;A['M:::::oiMjK:;ijKt:rJHKi'.ni .[^kkfimi-an 



7H ',/|/|7 



!'t'fl 



J)*;HH 7t/)37r> 7701) 

! 1 hyf-nt.hd icil pt ore til 



i:F'ii_()»".'t7 
r L -Kloruj.ir. i^ui I-'m-.toi '[':; 

vii" ;k i m: ;df: imk'I'I .ktlhwi** ;vni .tk* ki:ai j :ai ■< ;r it .kkawy i m ku ;i .a: :ai ;k k khr 

ETK W '. t [ AAKTPANf ITAL I KVMVF-ri'f Jl- VAMNAVKI/KI'V: \t il.LND I I.KN'KVI /I'VKAI - f jAA 
::::OUf'::L:;VDF.LKAVrM f JTViW-:MI l<l::i(VAYI- , I'KATN:T 1 / r UY.';iK*.Ni:K , l'VAI.TMt 






TADSLAKDIAMHWAAQPQFL3KE5VPAEAIAKEKI 
■ * FFQEACLLEGPFtKNADLSIQSLIDDFSKT.TGSSVA 

CPn_0698 783443 784201



♦I 



iKPQEVIEKIVTGKLNT 
IGA 



pyrH-UMP Kinase 

EPNKNMAKOTRRVLFK 1 3CEAL3KDSSNR rDEMRLSRLVSELRAVPJM3IEIALVIGGGN 
I LRCLAEOKELO ENR VfJ ADQMGMLATL INCMAVADALKAED I PC LLTST LSC PQ LADLYT 

pqks r ealdogk r l tcrmAGr; pylttdtgaalpacelnvdvl r katmhvdgvydkdprl 

■■■riv.:."vi-jFv::vKiii-i.::i.''ji; :v?*i.A::#\j.:r.- ■MO:;n:rrRVF::t--L' rf i!:JLKK ALi'Drri'Tr:.. 
V::klvni iv :::!•!•:!( 

CPn_0699 784179 784721

rrf-Ribosome Releasing Factor 

TMSVLQDTEKKMAAALDFFHKF/K3FRTC 

LRQLVISPYDGNNASAI AKG I IAANLNLQPEVEGS I IRI KVPEPTADYRQEMIKQLRRKC 
EEAKINVRNIRREANDKUCKDSALTEDVVKGNEKKIQELTDKFCKQLDELTKQKEAEIAS 

r 

CPn_0700 785094 785609 

CT676 hypothetical protein 

L>(VHSPTHOCYHCQQPATICVTEIDKDKVIRSYVCATCPCPSHYYNNEHL£LSKGVGVLT 
LECGNCKTVWHSKQDDEQLLGCHQCYT^KNQITSKLKSERWSSSFTMEKGQGSLHIGR 
APGEASrmJPLLKLIAI^EALQDTLEREDYEQAAVIRDQINHLKTKNPDDPS 

CPn_0701 785584 786672

karG-Arginine Kinase 

KPKIQOTLPNDLLETLVKRKESPQANKVWPVTTFSLAR^ILSVSKFLPCLSKEQKLEILCF 
ITSHFNHIEGFGEFIVLPLKDTPLWQKEFLLEHFLLPYDLVGNPEGEALWSRSGDFLAA 
INFQDHLVLHGIDFCGNVEKTLD0LVQU3SYLHSKLSFAFSSEFGFLTTNPKNCGTGLKS 
C^FLHIPALLYSKEFTNLIDEEV^IITSSLLl^WTGFPGNIVVl^NRCSl^LTEELLLSS 
LRITASKLSVA£VAAKKRLSEE^SGDLK>JLILRSI^IXTHSCQLELK£TLDALSWIQLGI 
DLGL IKVTENHPLWNPLFWQ IRRAHLALQKQAEDSRDLQKDT I SHLRASVLKELTKGLS P 
ESF 

CPn_0702 789700 786929 

yscC/gspD-Yop C/Gen Secretion Protein D 
LKKNPVKTVILNIGRXILCGIKKKKKKIGILSGLFFIX)LV^ 

RDEKIJ^CPKNSAASLSAKKSHTKKTTPGSIPSKVFSKFDATQDKTFQKTSGSAFPAKPT 
TLKELEERKKPRPERRTTADVKRS PRFLPTQEVEEPVPAASKEQLDS IQVWEEKQNYARR 
AVNAINLS I KXQLEEQT STVT EKDVQ P KT Q AT PHAS KKNVAS PSTSMPG IEKAATTVAVP 
QDK5EEEKVKERLTICRELTCEDLKDNGYTVNFEDIS ILELXCFVSKISGTNFVFDSNDLQ 
Flg'IVSHDPTSVDDLST II^VUCMHDLKVVEQGNNVL I YRNPHLSKLSTWTDSSLKE 
TCKAWVTRVFRLYSVSPSAAVNI IQPLLSHDAIVSASEATRHVI ISDIAGNVDKVSDLL 
AAitDC PGT S VDMT EYEVKY ANP AALVSYC QDVLGTLAEDDAFQMF I QPGTNK I FWS S P R 
LAISKAEQIXKSLDWEMAHTLJ3DPASTALALGGTGTTSP 

LQJHGYNLYVTTAMDEDF INTLNS IQWLEVNNS IV I IGNQGNVDRVIGLLNGLDLP PKQV / 
Y rEYLIOTSLEKSWDFXJVQWVALGDEQSKVAYASGLLNNTG I ATPTKATVP PGTPNPGS/ 
I PEPTPGQLTGFSDMLNS S SAFGLG I IGNVT^HKGKSFLTLGGLI^ALDQDGDTVIVLNP| 
RiW^QDTQQASFFVGQTVPYCTIW 

QIE&TI S ELHSASGSLT PVTDKTY AATRLQ I P DGC F LVMSGH I RDKTTKWSGVPLLNS I V 
P&S&GLFSRT IDQRQKRN I MMF IKPKV I S S FEEGTRVTNKEGYRYNWEADEGSMQVAPRH ) 
AP ECQGPPSLQAESDFK 1 1 EI EAQ 

• ?J3 

CPn_0703 791205 789685

pkn5-S/T Protein Kinase

rkigf>iixrggiplpepqviggyhvkkilskklrsrvvhgij4petrhstvikvfsp3t , sf 
tsrsvynflkeaqslhqithpnivkfhrygkwqix:lyiameyiegisij^eyilaqf/islp 
qaipiifdiaqalehlhsrijilhki)ikpenilitpcgkiklidfglai>n7reiqrahpsv 

IGTPYYMSPEQRQGESHSPASDIYAI^LI^YELILGHLSI^RVFLSLVPERISKZLWCAL 

qp.spnnrysstrefiqdihhyrksgdmqedlrikdhtvalyeql^qrfwlap/tlrfpd 
ffsgvlyhcgyplyphayotllegdvfm,wi^yspisnatialsvvkslvcc^/dlqrpll 
drv&eineclirmki p i demg i s i lclei skenkelswi acgktvfwi krqorwqdfes 
f^pgi^kitsi^iretkvavreigdeawctleleesvaslktlslaewdrfiqkaifcpi 
e^ehgg i qs rqhg sns pstl i slkr i r 

CPn_0704 792330 791209

fliN-Flagellar Motor Switch Domain/YscQ family
RY^^VAADSSASWLKSRhnsfFLSSLGKTEEQVAAPEFPKEIXQHKIHfEKFRLEDVOVSIK 
FRGSITAVEATKEFGVHLLIQPhWVQPWEVENLLFLTSEEDIX3ELMVAVFDDASLASYFY 
EKDKLLGFHYYFVAEACKLFEELCWPSLSAKVGGDAIFTATSLQGSFQWDISLRLTCK 
NVRCRLLLPEDTFQSCQKFFSGLHDESDLHNIDQTC^ISLSVE^YSQLTQEEWHQVVPG 
S F IMLDSC L YDP ET EESGALLTVQKHQF FGGRFLT PSSG EFK ITS Y PNLTHEDPPL PENP 
QASAAPLPGYSRLVVEVARYSLAVSEFIKLNLGSILSLGNHP^GVDIILDGAKVGRGEI 
IALGDVLGIRVLEV 

CPn_0705 793176 792334

CT671 hypothetical protein 
FMELKKTAESLYSAKTDNHTVYQNSPEPRDSRDVKVFS/EGKQTROEKTTSSKGNTRTES 
RKFADEEKRVDDEIAEVGSKEEEQESQEFCLAENAFAOfMSLIDIAAAGSAEAWEVAPIA 
VS S I DTQWI EN I I LSTVESMV I S E INGEQ LVELVLDASS SVPEAFVGANLTLVQSGQDLS 
VKFS S FVDATQKAEAADLVTNN PSQ LSS LVS ALKGHQLTLKE FS VGNLLVQL PK I E EVQT 
PLHMIA5TIRHREEKDQRDQNQKQKQDDKEQDSYK/EEARL 

CPn_0706 793689 793180

CT670 hypothetical protein 
YAVAKYPLEPVLAIKKDRVDRAEKWKEKRRL^EIEQEKLREKEAERDKVKNHYMOKZQO 
LRDLLDEGTTCDAVLQI KS Y I KWAVQLS EEEEKVNKQK EWLAAS KELEKAEVNLAKRR 
KEEEKTRLHKEEWHKEALKEEARAEEKEQDEMGQLLFQLRQKKKRESGGS 

C?nJ)Hn 7')S0.3S- 7/3704 

yscN-Yop N (Flagellar-Type ATPase)

WIMDQLTTDFDTI -MIjQI i tDVNLTTWOR I TEWGML [KAWPNVRVi^EVCLVKRNGMEPL 
VT EWG FTQ: 'FAFLJ PU'.ELJGVJ PS6 EV I PTGLPLH I RAGMf XUI F< V LNG LGEPt DVET 

k< ;ri/jrryivn-'i* [FRAPrDruiRAKZR0iLi;*n ;vrc idgmltvakv.ok ic ifagagvgks 

:;L.U;M [ AKNAKKAtJVNV LAL h lERiGREVREF I E( iDl/lEE' jMKR: IV I W:"T.'.;DQSL>QLRLN 
AAYVCTA I A EY F l> IXJ* KTWt-MMjfcivTR F ARALR EV'J LAACE P \ 'AH AiJYTPUV F3TLPR L 
f.ER:/;A:;i)Kt!T TTAFYTVI .VAt;Cn)MNEPVADEVK:1 1 LCIH rVE,::NALAOAYHYPA IDVLA 

:; r :;rllta r vpkkcjrr [ i< ikahIvlakykaneml i p. fieyprc ;pke i dfa i dh i dklnr 
CT668 hypothetical protein

AFKTVKRFFC FM I DPVECn^PlDAEAQ/ rTQN:* JTFUV? ELKKD [ P FALGJYAAPKD 
TTL\WFKPNPMAM«D0N:;tJLIDPEL0^E5EELOEU[NNLKCR[>n:FRGTFEDSOT 
AOFADEHTOAVWIIDLINEDL^rAE^CX3DARKEDKEEG^VTRKIIDWV3SGEEVLNR 
ALLYFSDRDGNRE3LANFLK-/QYAVQRAT0RAELFAJ I VGT3VSSVKT IMTTQLC 

CPn_0709 7M203 / 71S742 

CT667 hypothetical protein

:iiKy;?;.F:S^ =--.\i"*:..:::nJM: . 

QV I ANCK I ESTRALAQS VLWffjPTLVAKJAU PUJ 

CPn_0710 796482 796210

CT666 hypothetical protein

RS RGEKSMATNKSCTAFDBNKMLDGVCTYVKGVQQY LT ELETSTQGTVD tXTTMFNLQ FRM 
Q I LSQYMESVSN I LTAV^TEM ITMARAVKGS 

CPn_0711 796791 796486

CT665 hypothetical protein

TTI^VU3FINYL/LGRYSMFTJMEm , AKEEKNSQPLLDLEQD^DHDRAQEl^^ 
VH KLHALLREGSDI^ES FGQQQS LLAGYVALOKVLGR I NRKM I 

CPn_0712 799315 796781

FHA domain (homology to adenylate cyclase)

MAVRL r VDEG/lSGVI FVLEEG I SWS IGRDS S AND I P I EDPKLGASQA 1 1 NKTDGSYY I T 
NLDDTIPIVVNGVAIQETTQUCNEITriLLGSNQYSFLSDEFDPQDLVYDFDIPEENFSND 
SGDL^DSNHQGKDLEPRC/TSETNHSPKPKEKLTKDQGSSDPITSGDQEIJ^AFLASAKAE 
KNOPRAKT^RKKGLKESSNESLNPKEQNAKDSPKGEERTNKPQNAIMEDNGASPRQDPQPK 
SAEPSLKOTARDETPLKENKPVEEKANXKATPDSPEKKDQPEEGSKKEGSKIEATPLDSQ 
KESEDKiy^EAfVQEEEENLTEDNKEDSDSAADANDDTASDHTAEDNKETPKKVENEKSA 
VLSPFtT/QDLFRFDCT I FPAEI DDI AKKNI SVDLTQ PS RFLLKVLAGAN IGAEFHLDSG K 
TYILOTDPTTCDIVFNDLSVSHQHAKITVGlJDGGILIEDLDSK>Kr/IVEGRKIDCT 
SNQWAUrmJLLIDHHAPADTIVASLSPDDYSLFGRQQDAEALERQEAQEEEEKQKRA 
TLRAGSFILTLFVGGLAILFGIGTASLFHTKE^PLENIDYQEDLAQVI^FPTVRYTFN 
...lfeQLFLIGHVKNSTDKSEIX.YKVDALSFVKSVDDNVIDDEAVV^EMNILLSKRPEFKG 
li(MHSPEPGKFIITGYVKTEEQAACLVDYL^IHFNYl^LLEbn<VV\^^ 
GFANIHVAFV^EVILTGYVNNDDAEKFRAWOEI-SGIPGVRLVKJJFA I ID 

RYPNRYRVTGYSRYGEIS I^AAA^GRILTRGDVIDG^f^VTS IQPNAIFLEKEGLKYK 
f IDYNK 

CPn_0713 799817 799332 

CT663 hypothetical protein 

LDUCEEKAGFRNEIVS I PCGTKTTIAALENTSMLEKLIKNFATYMGITSTLELDADGAYV 
LP I SEVVKVRAQQNADNEIVLSASLGALPPSADTAKLYLQMM IGNLFGRETGGSALGLDS 
EGrAA^VRRFSGDTTYDDFVRHVESIWFSETWLSDLGLGKQ 

CPn_0714 801125 800091 

hemA-Glutamyl tRNA Reductase 

NYRIVLMVLGWGISYREAAUCERERAIQYLQSFEKJJLFIJ^QRF^ 

ELYYYSESPE I AQAALLSELTSQG I RPYRHRGLSCFTHLFQVTSGI DSL I FG ETEIQGQV 

KRAYUCGSKERELPFDLHFLFQKAIJCEX3KEYRSRIGFPDHQVTIESVVQEILLSYDKSIY 

TNFLFVGYSDINRXVAAYLYQHGYHRITFCSRQQVTAPYRTLSRETLSFRQPYDVIFFGS 

SESASQ FSDLSCESLAS I PKR I VFDFNVPRTFLWKETPTG FVYLD I DF I SECVQKRLQCT 

KEGVNKAKLLLTC AAKKQWE I YEKKSSH ITQRQ ISSPRI PSVLS Y 

CPn_0715 801636 803462 

gyrB-DNA Gyrase Subunit B

KFNKI SHMAAYTEAS I LSLASLDH I RLRAGMY IGRLGMGSQKEDG I YTLFKEWDNG I DE 
F IMGHGKSLKI S ASDKQI S IODQGRG I PLGKL I DCVSK I NTGAKYTQDVFHFSVGLNGVG 
tJCAVNALSEIFSVRSVRKKKY>ILATFHRGVIjQESKCGSTKDPDGTFVSFTPDPSIFPEFT 
FNHDFLKDKI RQ YTYLHSGLEI RFNDEVF I SHNGLKDLFDAE ITEP PLYS PLFFQNEDLT 
FIFSHLEGNTERYFSFVNGOETL 1 DGGTHLTAFKEAIVKGVNEFFGKTFVSNDIREGIVGC 
IAIKIASPIFES^KNKLGNTOIRSSLIKDVKEAIVOALRKDKVAPELLLEKIKFNEKTR 
KNIQFIKQDLKSKQKKVHYKIPKIJIDCKFHYNDRSLYGEASSIFLTEGESASASILASRN 
PLTQAVFSLRGK PMNVF S LEETKMYKNDEL FYLATALG I TQNE IQH LRYNKV I LATDADV 
DGMHIRNLLITFFLKTLLPLVTl^HLFILCTPLFKV^ 

KDSSLEITRFKGLGEISPKEFAAFIGPEIRLTPVTITSLESISSILQFYMGKNTKERKQF 
IMDNLITDF 

CPn_0716 803466 804902 

gyrA-DNA Gyrase Subunit A 

FMRDVSELFRTHFMHYASYVI LERAI PH I LDGLKPVQRRLLWTLFLMDDGKMHKVANI AG 
RTMALH PHG DAP I VEALWLANKGYL I DTQGNFGNPLTG DPH AAAR Y I EARLS PLARETL 
FNTDL I AFHDSYDGREKEPD I LPAKLPVLLLHGVDG I AVGMTTK I F PHNFAELLKAQ I A I 
LNDKKFTVFPDFPSGGLMDPSEYQDGLGSITLRASIDI INDKTLWKOICPQSTTETLIR 
SIETIAAKRGTIKIDTIODFSTDVPHI EIKLPKGSRAKEMLPLLFEHTECQVILYSKPTVI 
YENKPVECSISEILKLHTTALQGYLEKELLLLOEQLTLDHYHKTLEYIFIKHKLYDSVRE 
VLAINKK ISADDLHOAVLH ALEPWLHELATPVTKODTSOLASLT I KK ILCFNEEACTKEL 
LAIEKKOAAIQKDLORIKEVTVKYLKGLLERHGHLGEP.KTOITNFKTAKTSILKQOTLI 

CPn_0717 804968 805306

CT656 hypothetical protein 

I R IKFI DTITIWRMEPRH I YI RKPETPKAPDVEKPGVPEYMTMANTPTFEGPVKTLDQL 
RRAL I EQRGAEEGCKMYDN F I QS I L I STFG LVH KDMDRAQKASKRMRSVY K EQ 

CPn_0718 805300 805626

CT657 hypothetical protein

RA'/MflFTYFLALPVDRLMOERFLCGPKRWAPFINSPLYLTLIADHDTrYLAKNLDKFPLP 
VEOWEKTVtJfVnSLLKS IFUTSDLSSLRLLACTKFEI LTUJDLYUAON [ 

CPrt_07l» Mii^K77 H()6ft'»0 

sfhB (Pseudouridine Synthase)

FDTKLVKKV [K r^MKTVTSFTVCKLlN^IRLUKYLTEVUr-KYrJRAFYLH-H [LlXLVQtNGQ 
iriT!'7A , l , Ut.MCf;n[\TrDrCEKEEU.ELLPl^Ari , L[>K'/ , ^EC/;MtrA/ LNKl'RDHWHPAPO 
WVW JTI, VI I AI . f .!![•* I/i Hkl. KKF.FPKKl'WR I i t VHKLDKr/r'/jL I tTAKTUOAKKVFSELFG 

■lrr'IyK::Y^AV^Ml:Kl■h:7plUlmM::KMgNKRK[■K^7r;:;v;K^^^r^llrovtAKNl'.KL^;FV 

L:;i.KA<;Lri:ijMi<::Ki.iK[:FKNrn , iL.NKNi.[.i::;ri J Kr-:o 

<:it. j vim) i& HOt.M'ii hov;m'. 

hyj « .1 h"l' i.'.j ! pt . il <: in 
LR [TMKEFLAY [ t KNLVDR PEEVR I K EVQGTHT 1 1 Y^^B'DIGKI IGKECRTIKAIR
TLLV!"rVA:"RNNVRV3 LEI MEEK 

r;Pn_072L 907S7L 809489 

kdsA-KDO Synthetase 

KRMVMFNNKM I L I AGPCVI ECED ITLE IAGKLQ3 1 LAPYSDR IQWFFK3SY DKANRSSLN 
' J F RG PC LT ECLR I LAKVK ETFG VG I LTDVHTPQDAYAAAEVCN I LQ7PAFLCR0TDLLVA 
TAETGA T VNf .KKGOFL^ FV/mFr,P Tf ir/I .:rrnrrriK 1 1 JTFsRGC^FGYT ITfl/Z^DMRS r PVL 

: :io ;r ! -v s fj atm. :v-r .1 * ; as .: :tk: :•' v w .tkfv; :-:..:i-.\Ai./v\. ;am/;:.:- • ;•::•« iTNPK [AK3 
uAA;;M; J JLii^['V^LLri'wbVLi- , ; , ':v::.:;- , !;Mv.'A 

CPn_0722 808477 808974

YGLSWTK^CGLFYSLGLL^ 

SKQICSPEERFFHCKIDKSCMELHFPQSSYSCKEYLTRISGHILTQNFEKQMQFRGNSGL 
LNYQDGSLHVYDCRFQVDPVPGYGSPDKEDSSSGGMKTLYLSLFRN 

CPn_0723 808978 809703 

yhbG-ABC Transporter ATPase 

ASMPILSVC^VKKYNKKPVTNDVSFQINPGEIVGLLGP^AGKTTAFfLTVGLIRPDSG 
K 1 1 FKNVDVTKKTMDHRARLG IGYLAQEPT I FKELTVQDNL IC I LE 1 I YKARKQQS HLLN 
TLVDDLOLGSCLHKKAGTLSGGERRJILEIACVLAI^PSVLLLDEPFANVDPLVIQNVKYL 
IKILAGRGIGILITDHNAKELLSIADRCYLIIDGKIFFEGSSSOMISNPMVKQHYLGDSF 

SY 

CPn_0724 810602 809706 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RTSTRLDYRSGC I LS K I LP FP ELWKMLLGFLCDC PC ASWQCAAVANCYDSVFMSRP EHKP 
NIPYITKATRRGLSMKTLAYIJtfUaJARQm 

FFYGCSNIEDILEEMRRPHRILLLGFSYCQKPKACPEGRFNDACRYDPSHPTCASCSIGT 
MMRLNARRYTTV I IPTFIDIAKHLHTLKKRYPGYQ I LF AVT AC ELS LKMFGD Y ASVMNLK 
GVGIRLTGRICNTFKAFKLAERGWPG^ILEEECFEVLARILTEYSSAPFPRDFCEIH 

CPn_0725 810829 810587 

CT652.1 hypothetical protein 
SCGDVGMFFAPLLYESLR*Gl>lHPTSHMa}QL 
GLTTIKAIAEEVLSDDEPLLD 



CPn_0726 813384 810880



CT620 hypothetical protein 

apidmiystsistfykklslvssmhsfaqrhreslehianyekttaerbiijcrlie^dq 
rXseryrsaveklhkyeveratvaks I PVAAI hekplssthasvqvtast paatgsgvga 

YYNAVKQKWAQ D L I VELNTVMTT I MASVN S KN P ANKDVF DKU7TELQALVAAGNNLT EEN 

F^LYNFPEEIFTAIORAIrrFTGG^IKTDFTNOI^GKYC^QATLTCTFADGR^^FKDIL^ 

AVQGVLTPEQFTIFAEIATELQALADHVGNFDEAGLQRIEDAGEKLAAVINSSDLTRNDK^ 

I^0HITDLYSO3VAALGSFDTVI^A5IYVN0HQGTMFSNLSSFVGSLIGTFAPIDLSS 

SQGDI S SAALAGALQTARGLNS RFNELTAEQQKL INEC I KSLVTFKCGEHLGAIWAYFTA^- 

SWAL^PTATMDHVKAAILEEAKELDNSSFQLASSIKSA^SIWSSGSFSVTVNSSTU 

qtfei ys ekngkveinq illnygstgflpe itk1aktna£starsyfrfkalaavesenv|, 
nk^dlqsqlc£ftnmktelfdgqijlsqaselralplpsava 
yteyysnlgss igns i idaisqyvngatyfnfasyvgqqpavgagganafpgsqesaqa 
ktoqerkqaalylqetrgaltvi eeqrarvlkddkitneqrst ildslrnyednins i sg 
sPCLWlqplsiaggsvagtfevkegqec^^ 
sfjq s dqq s f admgqnfqldlqmh lt smoq ewtwat slqllnqmyls larsltg 

ill 

CPn_0727 813559 816192 

CT619 hypothetical protein

KYYLFSMSTFSIQNRLRTISGESTRIIKLDHKYSGFDPRSVPA1NLEELNSGIYALR 
NA^SENTNVAALLNPNNT I FPTTSWTDYKHSRPQASS PRAPSSQT PTD IVSAAALJCLVL 
V I DGGLAELVAS VTE I DLG ALST ISTVRQLMASYLGLTT LTAEQEKWFS S S YVP9EKNL 
LEHVKQ EKAAE I QAKQEEI KAVLEAKGVSTEE I EAI LKEYPD IYAADFFKEF I EBPLHTY 
RjyCVGAPIQE^ENAIQLLPTPPAITPDNVNEVhKSMNTLSTILQAIDDAIKQA 
E f I T I LGTLVPLVDKTTFTKAEFDL I YTATQLPNTAS LKLYLTDRQ I AEYRGKITKVYQN 
S I QNLS ETKRWENNRSMLETQLSMFOQAQNC FVTWI SQANALNI A ITNKYI^AVLTTSM 
EWC^LLCLSYWERLADDEKAIFDKSVNE^LPIHIWGGSWV^ 
LS2&VT SQDQ I KAYLQTRGNEFKATRH FF HNI GDQMYQF ANETVFG^LT/ANGAIQPDL 
GG ^IREAMTNVGTVEADYVSNAQR I IJJEFNTAATAHVl^LQLQ I AELQKHADDLDPGKAS 
ftAkfavaawitseslgdalismi LNSQLPKQEAFLKPLI EEINFNNLAANALNSLLQ 
ITNEFSTTSVYYSLSSYLVQSKTGONLFAGDYYETLLAAAREREYIYRCTARCKQAINLV 
NGLLQKINSLPGATSAQKQEMLNATTYYQYSLSVTLNQLTVLESLLAdLKMTI^TTSNNK 
YDKSVFKI ESFDDWI PTLAALESFLTSGFPNI SATGGLGPLFTQVQ/IXXJTYTSQGOTQQ 
LNLQNQMTT IQQEWTLVSTSMQVLNGILSQLAGAIYSN 

CPn_0728 818483 816525 

CHLPN 76kDa Homolog (CT622) 
VFMVNP IGPGP I DETERTPPADLSAQGLEASAANKSAEAOR LAGAEAKPKESKTDSVERW 
SILRSAVNAU^$IvU3KLGIASSNSSSSTSRSADVDSTTATA^PPFPTFDDYKTQAOTAY 
DT I FTGTS LAD I QAALVS LQ DAVTN I KDT AATDEET A I AA^WETKNADAVKVGAQ I T ELA 
KYA3DNQAILDSL^KLTSFDLLQAALL0SVANNNKAAELjfl<EM0DNPVVPGKTPAIAQSL 
VDQTDATATQ I EKDGN A I R DAY F ACQN ASGAVENAK3 NWS I SN I DSAKAA I ATAKTQ I A E 
AQKKFPDS P I LQEAEQMVIQAEKDLKNI KPADGSDVPNPGTTVGGSKQQGSS ICS I RVSM 
LLDDAENETAS I I^SGFRQMIHMFOTEWPDSQAAC^e/aAQAPvAAKAAGDDSAAAALADA 
QKALEAALGKAGQQOG I LNALGQ I ASAAWSACVPPAAASS ICSSVKQLYKTSKSTGSDY 
KTQ I S AG YDA Y KS T NDAYG RARNDATRDV I r^ST/ALTRSVPRARTEARGPEKTDQALA 
RV rSGNSRTLCDVYSQVSALQSVMQI IQSNPQANMEEIRQKLTGAVTKPPQFCYPYVQLS 
NDST0KFTAKLESLFAECSRTAAEIKALSFETN9LFIQQVLVNIGSLYSGYLQ 

CPn_0729 819905 fi 185^6

CHLPN 76kDa Homolog (CT623)
PAWSSVoTLN I DTKDTMKKOVYQWLASWL/ALT ISGYAF.LPL.'IEOKVKSHTYTTLDEVK 
DYL. r ;KRGFVETKKQDCVLR [ AC DVRA RWLVTR ED I Ki J PS D K DK YN P L PVN R Y RS EFY L Y I 
I )Y R A ER NW l.f>i K MNV/T A [ AU 5 ENT AAG Vpf I N R A F LG Y R F Y KM P ET RT D F FM E [ G R 3G LG D 

i ,f e:; EVfjFo: ;nf lx ;lh t ywtr el:: k dy uyo v r vi igg p fwnmt kki i v awwe< * i lnrlp k 
oi-rvKc:';vvLwn^r^F/r::rrFJ<AA^ 

f;AFI.Mrn'IAKA1 , KrrLW;KF^LAWF/k;TU7;LRKA^I^:;A'I , VIA , KYVEALr;VPEIDV. f :(; 
[ ( ; Rf INI . I .K KWF AO A I AANY PPK EAf^FTN Y Kf F: : AL YMY< I T T r /: I L: : K RAY( J A Y 'JKPANDK 

u/ion'i-'KKFDu;! t:;AF 
':tTij)7 io >< j it, 

mviN-Integral Membrane Protein

::<:kk< v v ;kn< :[.M:'.RK[jMKV:;tAn:UFNiL:7;'l'!-Y::;i' it* ; I r-'RRi amatyi-yjadp ivaafw 



'/ 14 
li -FRTVFFLPK I LGGL I L!^«hFEFLRAC^bRA/\FFFRRF:;R L I KG^T I t FTLLI E 
AWWWLOWEECTVDM I WKKlFCC t FLMwYNVNGALLHCENKFFG'/GLAPWVNI I 
W I FFVr AARHSDPRER I IGL3VALV ICFFFEWL [TVrGVWKFLLEAKjPPQEHDSVRALL 
APLSLGILTSj I FOUJLUJD ICLARVVHE l5f LYLMY5LK I YQLP I HLFGFGVFTVLLPA 
ISP^REDHERGLKLMKFVLTLTMSVMI^ 

VRVLRGYGA3 1 1 PMALAPLVjVLF'Y'AORQ/rAVPLr IG IGTALAN I VLSLVLGRWVLKDVS 
GI TYAT3 rTAW\/QLYFLW*fYrjSKRLPMY^KLLWEo TRRS tKVMGTTMLACMITLGLNILT 
QTTr'VrFLNPLTPtAVJPLr;'* TTAOAT A^-HESC rFLAFLFGFAKLLRVEDLINLASFEYW 

CPn_0731 9214J4 / SJtVbJ

No robust homolog present in Genebank/EMBL as of 11/7/98
VA I A I S RN I PVI RLQr/PDN I LKJERAKETS LSFLL I KPFS P PPLKQDYLFD I SPYTSS E 
IT IGGSYFKLNKASLCSSTLRL^S I S 1 1 S 

CPn_0732 822092 822976

nfo-Endonuclease IV

NFMKVLPPPS I PLLGAHTs/aGGLKNAI YEGRD IGASTVQ I FT ANQRQWQRRALKEEVIE 
DFKAALKETDLSY IMSHaZyLINPGAPDPVILEKSR IGIYQEILDC ITLGISFVNFHPGA 
ALKSSKEDCMNK I VSS FK}SAP LFDS S P PL WIXETTAGQGT L I GS MFE ELG YLVQNLKN 
Q I PIGVCVDTCH I FAA^D ITS PQGWEDVLNEFDEYVGLSYLRAFHLNDSMF PLGANKDR 
HAPLGEGYIGKESFKSfeDERTRKIPKYLETPGGPENWQKEIGELLKFSKNRDS 



CPn_0733 823739 823101

rs4-S4 Ribosomal Protein
GLKYMARYCGPiofevARRFGANIFGRSRNPLLKKPHPPGQHGMQRKKKSDYGLOLEEKQK 
LKACYGMIMEK&VKAFKEVI HKQGNVAQMFLERFECRLDNMVYRMGFAKT I FAAOQLVA 
HGHILVNGRR^RRSFFLRPC^QISLKEKSKRLQSVKDALESKDESSLPSYISLDKTGFK 
GELLVS PEQ^Q I EAQLPLP INI SWCEFLSH RT 

CPn_0734

yceA

QNTKEHFSShraiFLQ£NYFQDYVRVFIMEKKYT 

DVSCR^Y IS EQG INGQ FSGYEPHAELYMQWLKERPNFSK IKFKIHH IKENI FPRITVKYR 

keiaSLgcevdlskoakhistoevweklqenrclildvri^e^ 

R E F BEY AEKLAQ EC DP ETT PVMMYCTGG I RC ELYS PVLL EKG FKEVYQLDGG V I AYGCOV 
GTGKWLGKLFVFDDRLAI PIDESDPDVAP I AECCHCQTPSDAYYNC ANTDCNALFLCCDE 
C I^HQGCCGEECSQS PRVRKFDSSRGNKPFRRAHLCE ISENSES ASCCL I 

CPn_0735 825680 825003

Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine
Ribonucleoside Kinase)
GEKFM^frOJWIIGITGGSGAGKTTLT^ 

LIWDHPDAFDNDLLISDrKRLKNNEIVQAPVFDFVLGNRSKTEIETIYPSKVILVEGILV 
FENQELRDLMDI RIFVDTDADERILRRMVRDVQEQGDSVDC IMSRYLSMVKPMHEKF I EP 
TRKYADI IVHGNYRQNWTNILSQKI KKHLENALESDETYYMVNSK 

CPn_0736 827731 825992 

ygeD-Efflux Protein 

RGELLKIJ^CCLVAFMTVSVKKKSFRALVTTHFLTII^NLYKFI^ 
KILSCVSFFFALPFTjlXAPLAGSLADRFQKRNIILATRFIEILCTIUTTYFFFIQSVVGG 
YWL IL^HTT I FGPAKI/3 1 LPEMLPSEQLSQAiTC IMT^ 
HRLGVNSYVWPTIiTCVIVSIISTLISFCIRPSN\nQ^VKQKITLVSFKDLWKV^^ 
YLTVS I FLGSFFLLIGAYTQLE 1 1 PFVEFTLKYPKHYGAYLF P IVALGVGTG SY ITGKIS 
GKDIKIC^VPLAAIGIJVLVFMGLYAFACSILFVLFFLI-AL^FLGGVYQVPI^AYVQYASP 
EHKRGQ ILAANNFLDFFGVLVAAGVI RVLGSNLGLS PETS FFY IGWFVLAVS IWTLWIWR 
EHVYRLLLG IILRRQLGYYIJCIHQSSSPKCYFVAVQSYREIRRVIAALTKTVRSRVI ILD 
QKLVPGWRAWLLSWCVPTWS S VRDNDS EAQD AWAVLO ANH LKT S LKKF PDVS WC LGLP 
KNVERFTS I LQEQGI DLHP IQLVQKEGKKRVIYTLVFPHA 

CPn_0737 827469 830756 

"recC-Exodeoxyribonuclease V, Gamma" 

KRSAKLPASGASKRKGRAKKKLTQER I FAFSVRVLP SNRKNAKRNLYKLSF I IVRKCWT 
S ALNDF FLT ETVMNAT KHC RAS FSNS PRHLLAQLAED ITSTHQKPFTKRWILVANATTGH 
WIKNQLVHVLSDHIFMGSTIFTASDSIVKHLFLGSGCSQPNIPDYLTLPLLINNILEEIS 
KASKFErKREFLSPPTYETTKKLAAAFKQFHTFSQRPTKNASHYQELFQILESHFSSYEE 
MFTTILNNRTQEEDCSLHIFGYAHLPI<HLAEFFINLSTYFPWFYCFSPCREYFX;DLIjSD 
RAI DFFWNQLPDS P I KNAWEHYVLSDRQALLANLAHKSQSSQNFFLDRE I DYQEMFLPSK 
HDSSLGVIQNS I LDLKPTS PQDFSQTKQT IC I YRALNI PREVQEVFCKVTELLHRGVSPE 
E I F I LS SH I ESYKVHLNAI FN P HVP I YFT DEVDPRAEDL RNK I LLLS S I LQTQGDLH Y I L 
QLLTH PQLQQP I DQNKVPYL I KKLSS EWGK I S SKDRASGQQMKALGDL I LEEYPFHQEGG 
RVSQVEVWETTVPL I YF IQER I NLYLSSSQHS YEDLFQNVFSCLEK I FVLS PEETS F ITT 
LRNSLFPTFATSSCSLLFFTDFCLDFLLHFHKPSPLYDKPGPYIGSLSSLSLIPKGYVFI 
LGANKTTSSDIFDLI^RTTTHEELAFSSTEDEENFHFLOILVSTKHELHISYISSAAQFN 
L PS PFLNH I KETLDLPVETLPTQPYLSAFFKNKACLHTSOEYNYSLAHAFYSKKALLPSL 
F I PTVKQVNLPQH LSLNE 1 1 KG I FSPLDLFLKTNYNLR I SYPEHLKKOQKLFPTKHQI ED 
FWN ECFVDKEHDLI PS I SPHAEELFTYYREKT I LLRNGLDKDPKHS PYTVTFSSS I FEER 
PYHE3YLFPPLSLSFWNPV0IHGTIHGVCNEGLYLCSIDPRDSLKKTTRTLGSLPETSS 
EOKQLLERYVALAVLOMSOHLSSDSALIKLTSFHTKENHHPPFSDPEGYLRKVLEVYHLM 
SSQPIPLLSPLCWKTLDDEEKFHOAVLSAISEEAKNPSLPIFWQFHNRNIEEILNHVCAS 
ERLKILSLFRGPCEAV 

CPn_0738 830719 833895

"recB-Exodeoxyribonuclease V, Beta"

KFYLFS EVPVKPFN I FDSNSS IOGKFFLEASAGTGKTFT I EQ IVLRALI EGSLTHVEHAL 
AITFTNASTNEL KVR I K DN LAQTLR E LKAVLN CQ PAS L PT Y L D I NC hA/KQ I Y MQVRJJ ALA 
TLDOMSLFTIHGFCNF\'LEOYFPKTRLIHKNPALTHSQL*/LHHITNYLKQDLWKNVLFOE 
OFHLLAVRYNETSKHT:;SLVDKLLA^YTOPI-SYFGSRVERLEOl3LWHCjQIYNSLLEIP 
K0VFLD0LTAHISGFKK0PF:ULDDLHMFVDLLYTSETH3SLFSFFKIAETFNFKHRLAR 
YKr<:AAPTVLENM. f W\'EF<TLEFCriLDrUKNTLLVDLQEYLKgNYTPWLSPDESVFALEKL 
l^fir-EAOPWOALRFOYOLVL rDF.FODTDKC/y//" I FSHLF IS PKFTGSLFL ICDPKQSI Y 
t>/R:JADLl v rYLTAK:-SF::EDKOLOLVTIWRf;Tr'KLMEAriJQ[FGKI::PFLEIPGYLPIEY 
HAL.Nl'OU:J ET FENrrHAP 1 1 IFF FY ETIK Df J A LW I FH EALPLQKEQK r rLt^NMWLVS DSN 
OAFELISYAT CPV:'F:"KNK:; t FHI.TETH [ l.TTALLEAILMPENYEK CUK I [iF;>. f ILFCL.1L 

DKvrTKKEPrrtYFO:u,H:;Y[::iifi^rM.A'f'FY[''/M , rr</;tr/LF:::;pf'00i jfoemeklcgy 
ijrr[:;:;YPYiioLLiiLKNF:;hr^;i<v/r^:i:tAf:;:JY:;EDLE'rLKiTT[ii:;::KiU.EYDiVFCPG 

[EK^KKNKSlJi'ELLKKM'i'VAtr'rRAKKOLY LI* I.'ITQPPrJI/jRIjilALTNYVKLEGTQnSAYD 
IAI!l[.IfyFJIPDLF:;Y:U ( PKDH(:ilArr7I,NLl*LLETFALK'/TPPKT[FSF:;:;TKFLLDTMK 

[jS(j:;rpY::KLrr:;KooLp[>;EKT^njiiK[ij:r:roF:jr.t>oryi'EYiw::TtMRF[KHTH^ 

KErrnLKLL.:-KTFF:'rt.TF:::\tTF.';t.:;'jVt ,1 -f \Y I FRETLTLFLEhJiJKIWOCVCDLFFEHE 
< IKYY [ [DWKTSFL/arfNUDYr.K:;!!!,:; I y I KOKKLDYVIP rYVKAVKKFLNOFEUmDVEL 
gv r f r Hf ; t (yrv:NCFFALN.';:;ED I pnkmpka i^kcoa

CPn_0739 fl]4 8*.>2 fjJ3861

TT IhM hypothetical protein 

CKVLFKLM3Y3LRNKKTKIWrriALCILSFRSIP0EVYDKIR3SFVSLHVKFFPKIKQ 
API33HLANLELENLVLKERVA5LEEKLKLYEVSNHTPPLFPEILTPVFHKLVBGKVVrTlD 
YTHWSSSCWVNVGKTHG I KKN3 PVLSGNV LVC LVDYVC EHQS R I RLITDVCMKPSWAMR 
GD [Q.^W r KH.TLR EL [ PQVEO L3HAY I LEKDKYEK I S0LQELD3LIQGEGENQALLRG I L 

. : ■;■;< ;■ ialwk; . ;. :i . -\,v: \v: :y-\-::vf ;ktll( « w ■ : :.* — t.".li> :vfw-g:.; //apvtkvkae-rd 
■ ;ai tfk : i-:.-v. . :: .i-x ki .me; : u-v\.i:vu\ nr. ~ r r i r ) llw& 

CPn_0740 836054 834864 

tyrB-Aromatic AA Aminotransferase

SYWSFF^^HIPTFSPDAIIJGW^AlTFADKRPEKVNLVIGVYEHPQKRYGGL^CIRKAC5VI 
LEEEQNKSYLPISGLQIFI^EMRELVFXSAVDPSAIVGFQSLGGTGAI^LGARLLSVAKGS 
GKVYVPEGTWSNH I RI FSQEGLEVI RYPYYSKEQKQLLFEPL IAFLKEVEKNSVI LLHGC 
CHN PTGVDFTEDMWKELAI LMKEREL I PFFDTAYQGFAHG I ELDRKP IEIFIS EGNTVLV 
AASSSKNFALYG ERVG YFAVHSTFTDELVK IHSFLEEK I RGEYSS PQRWGVE IVSTILSN 
PYLKEEWQSELNFIRESLGKMRTRFVQALRKVAGHTFDFLLSQHGFFAYPGFSDKQVLFL . 
REQHAVYTT AGGRMNLNG ITEKNI DHWQSF IQAYEL 

CPn_0741 838383 836185 

greA-Transcription Elongation Factor 

EY I FRLKTG D I VDYLEKLQVL I EEGQSANFLS LWEEYCFNDWRGRELVE I LEKVKSS SL 
ASLFGKIVITIWPLWEKIPEGKDKDRVl^LILDLQTSNSOMFFDIATEYVNKKYSGEENF 
NEALRWGLRDGRDFQFSLSRFDFU<HMHKGNP/FHQGGW3VGEVMGVSFLOQKVLIEFE 
G IMS AKD I S FETAFKS LT PLSGDHFLSRKFGDPDGFEAF AKENP I EWE ILLRDLGPKTA 
KEIKDELVDLVIPEADWNRWWQSAKTKIKKGTRIISPDNPKEPYVLSDAGCSHMGQLERK 
LGLSL^SA£KISLIYHFIRDLHSEIJCNIEIRXSLVKAI^DLDVEEGNKSLII^RELL1*SE 
YLGIKDAS I DKEYITSLSEDOTSRIXENMPIVALQKSFLSLVRKYSSFWQOVFMQILLYT 
TSPTMRDFVYKTIKNDPSSVEVUCKRLLDSAHQPMMFPELFVWFFLKI^ 
KEVLRLFLESALNFMYQVASTPHKEI/3KKLHHYLVGQRYLAVRQMI EGASLPFLKELLLL 
STKCPQFSSSDLNVI^SU^EWQPTLKKHKSNVEEEWLWSTSESFSRMKAKLQSLVGKE 
MVDNAKEI EDARSLGDLRENSEYKFALEKRARLQEE I RVLS EE INRARI LTKDLVFTDKV 
GVGCKVTLKGDAGEWEYT I LG PWDADPDSC I LSLOSKLAQNMLGKKLNDW I LQGKEYK 
ISRIQSIWEEHGA 

CPn_0742 838442 838888 

CT635 hypothetical protein 

TKMMVI VMNSKSAQKI IDS IKQILTIYNIDFDPSFGSSLSSDSDADYEYLITFCrOEKIQE 
LDKRAQEI LTQTGMS KEQMEVFANNPDNFS PEEWLALEKVRS SCDEYRKETENLINE ITL 
DlgfTKESKRPKQKLSSTKKNKKKNWIPL 

CPn_0743 838956 840362

"ni^EA- Ubiquinone Oxidoreductase, Alpha" 

IF^ITVNRGLDLSLQGSPKESGFWKIDPEFVSIDLRPFQPLSEJOi^^ 

IA^KHFPbTIYITSHVSGVVTAIRRGNKRSLIJJVIIKKTPGPTSTEYTYD 

EBFKENGLFALI KQRPFDI PAI PTC/rPRDVFINLADNRPFTPSPEKHLALFSSR£EGFYV 

FV^GVRAIAiaFGLRPHIVFRI)RL^^ 

NEKE^A^FTLSFQDVLTIGHLFLKGRIL1^EQWALAGTALKSSLKRYVITTKGASFSSLIN 
^^5DNDTLISGDPLTGRLC?QC£EEPFI^FRDHSISVLHNPTKRELFSFTJ^IGFT^KPTF 
TlflTYLSGF F KKKRTYTNPDTNLHGETRP 1 1 DT DI YDKVMPMR I PWPLI KAVITKNFDLA 
NEpSFLEVCG ED FAL PTL I DPS KTEMLT I VKES L I EYAKESG I LT PHQD 

CPn_0744 841387 840389

hemB-Porphobilinogen Synthase 

EMESLTLSRRPRRNRKTAA I RDLLAETHLSPKDLI APFFVKYGNNI KEEI PS LPGVFRWS 
LDLI^EIERLCTYGLRftVMLFPIIPDDLKDAYGSYSSNPKNILCHSIHEIKNAFPHLCL 
ISD^DPYTTHGHDGIFL^EVIJroESVRIFGNIATL^ 

RS^QSGYSKTSIMSYSVKYASCLYSPFRDALSSHVTSGDKKQYQMNPKNVLEALLESS 
LDEEHSADIL^PAGLYLDVIYRIRQNTCLPLAAYQVSGEYAMILSAFC^WLDKETLF > 
HEgfcl AI KRAGADMI I SYSAPFILELLHQGFEF 

CPn_0745 841903 841742

No robust homolog present in Genebank/EMBL as of 11/7/98
Vt)S|FDDWRASSLOGSTTYWAYDPKHTLAYGFCNQVSVKKFHLKPPKSQEKFL 

CPn_0746 841939 843567

CT632 hypothetical protein

fsgrcpfsfevfmlgkeeeftckqkoclshfvtnltsdvfalknlpewkgalbskysrs 
^/lglralllkeflsneedgdvcdeaydfetdvokaadfyqrvldnfgddsvgeZggahla 
menvs i laakvledar iggs plekstryvyfdqkvrgeylyyrdp i u^safxdmflgtc 
dflfdtys a l i pqvra y f ekly pkds ktp asa yats lrakvldc i rgll paifr ltnlgf f 

GNGRF^QNLIHKLCX3HNLAELRRLGDESLTELMKVIPSFVSRA£PHHHHHG>WQYRRAL 

keqlkglaeoatfseemssspsvqlvygdpdgiykvaagflfpysnrsl'ddlidyckkmp 
hedlvqilessvsarenrrhksprglecvefgfdiladfgayrdlqrhr/ltqerqllst 
hhgynfpvelldtpmeksyreamepanetyneivqefpeeaowpma/nirwffhvnar 

ALOWICELRSQPQGH0m-RTIATGLVREVVKFNPM-/ELFFKFVDYSD/DLGRLNQEMRKE 
PTT r 

CPn_0747 843949 844053

CT631 hypothetical protein
RTCMGCKGAEVO I LSSRSLSCMKI LSSSLFYKKFC 

CPn_0748 844096 844121 

ispA-Geranyl Transtransferase

GTLVLH ALDTYR PS I ESA I EKALEGFGP IGH P I RSPVEYA&GCCKRLRPCLVCMMAQCL 
GLNflDVMDSALAVEFVHTlJTLIADDLPCMDNDDERP.GRPWHKAFDEATALLASYALI PA 
AYSHLRLNAKKLKEQGCDPREIDrAYNIICDITDKTIIGgGGVLCCOYDDMFFSNRGOEHV 
OS IM I KKTGSLFE I AO r:iGWLFGCGDPOFAP I ITSFSMQfCLLFQ I KDDF3DLQKDSOOI 
« M.NYALLFGEKAALELLARGCNNCLELLDRLSA0GLK/f5SEFETI tSSLGSF 

CPn_0749 H,l'.ti3M 84101)1;

glmU-UDP-GlcNAc Pyrophosphorylase

VCYMTYf iA:*:: I RINCDFLYI-Et V. IKAI lY'l'WD I LD^ML>jMLENHVF:X'. [ HGTVE.'XiVTLKN 

i ek i e r aemayvk: :» ;ay r vf ;rv r u;:;utevrho/ylrgnv rn-.-:m :wghcte I kmsylc 

MHTKAAHKAYI ; !O^VI.:":U';VNLA»AGVRO\NFnt((D''!r'M I YVr<:"T:TDKr;KK IDTGRRKLGAF 

u:K( ;7Ai':rNvviNi"\;on rt.PHTR [RRiovr 
tCtD/CpxR-KTH TiMn
Doman 

KI TDF I LR I HSYNLFCFHM ICDK 1 1 L FVT EDL3L/SS Q LK DLASQR3 DYQ I LV3 PVF PTS F 
ESVAIFCEYLLLPEQ I FSPG I FPEEDL I VLFD^EEAITKVLNCCATCYLLRPITAKVL 
DAVIRAFLRQHEVLEHS I PDTMTFGDHTFRVL/fLVI ESPEGSVYLTPSEAG I LKKLLINR 
GHLCLRKNLLAE I KGNTKE 1 1 ARWVDVH I As/RKKLG PYGSK IVT I RGVCYLFSDDDS I P 
LQNHDNTAHPNEE 



CVr, 



H4S/07 



MFRC ILFGI FLLTCFSSGGVLY YLFC JtfDF J IGPKEKSRSVW I EEEKiFTDSVLHHLPSC; 
HQHLH I LCFQGFLLQKQQKFSQAEK I BSKVYDEAQDGPFLFKEEI LGSRLINS FFLEKTD 
VMET I LCLLNQRCPNS PYYHLFKALWCYKQKLYREV I EQLAYWQEEKTRALAPLLN I S I E 
QLLTDFLLDY I SAHSL I EOKMFPEGRVILNRN INRLLKHECEWNAKTYDRI AILLSRSYF 
LELVESKSAD I YFDYYEMVLFYLKtf I Y I LEQC PYAELLPEEELVSL IMEHVF I LPKDKL Y 
PLIQU^EMWQKHYVHPNSSLWOSiVDRFSTHMEGAIRFCEALVSFSGLEELHQOIITTF 
EEIJ^NKVC^IKTEEAKCXTVAL^ILDPSISISEKLALSSDTI^NIVSGDDEQHTKlJy^ 
LDLWEA IQS YD I DRQQ LVH H LVYG AKDLWKKGGND EKALNLLQ L VL RFT SYD I ECESWF 
LF I KQAYKQALS SHA I ARLLOEKF I S EAN IPSIVIS EAEKANFLADAEYLF AHED YDKC 
YL YS>!WLTKVAP S PQSYRI^flLcLMENKR YDEALEFLCM LS PNDS I NDYKTQ KALAFCQK 
HQSKDRAAS 

CPn_0752 848595 850082

"recD-Exodeoxyribonuclease V, Alpha"

GWALKTEFAPFLEDLVHTOVISPLDIAFASKHtSSDFEESFVFLAVSSALWRYGHPFLSL 
EENRI RPSLGG I S ETDLYRGFJ^PKEARDKLFWVSGRLYLRSLYT I RSKLLDKLSLLC 
SATPNYFPPS I DSS I^SEEQNF I FNK ITQGCFS I VSGG PGTGKTFLAAQL I LS LVKQQPK 
LR IAIVSPTGKATSMIRQI LMKYNI FDDMVLMQTVHHFLQEYAYRRYNS IDVLLVDEGSM 
VTFDLLYSLVrTTLOGYEKDKKLYTSSL 1 1 LGDTNQLPPIG IGVGNPLQDL IGYFHENTFF 
LKTSHRAJO'GVVTOLTQSVUlGEMISFSPLPSISSAIEVIJaWFVKSLROS 
hlRHGPVK^n-NLMTMIHQRLARSDPDLRI P IMVTSRYETWGLFNGDTGLLCLKTQKLHFPQ 
HEPIDSRALSQ/vYNYVMSVHKSQGSEYDEVI VI I PKGS EVFGVS I LYTAITRAKYRVSV 
WGDPETLHKI? 

CPn_0753 851009 850161

No robust homolog present in Genebank/EMBL as of 11/7/98
IMATAHLaRQALLKLRSWT PAI RASGNLF RQQSMS LHNNVLF AGD I VGA I KNSTAI S RHA 
LGSSHYAHAALQKTEGFLGAADGVOTAVAGAMLUGQLLNGSM I FETDEETGELRRCNEAD 
QKLQRRSALT ITG KVARLAS KTLGT ATFLH EMDWS LGANANK IGCKVTSCLNL 
VATGCJS LTESS I SLYR ILSTRPETI SDPENRNKPSAEFAARS KAI RNAF I AWLGDWDLV 
CDALjFrLSLFLPArLGVHAVLIMAILGLISCVINFVKDYAKIG 

CPn_0754 851381 851040

rs20-S20 Ribosomal Protein
C^ILNLKVL^^GDIMAPKKPNKKNVIQRRPSAEKRILTAQKRELINHSFKSKVKTIVKK 
SLKLDDTQATLSNLOSVYSWDKAVKRG I FKDNKAAR I KS KAT LKVNARAS 

CPn_0755 851579 852799

CT616 hypothetical protein

KDLFFMLLVRXWLHTCFKYWI YFLPWTLLL PLVCYPFLS I SQKI YGY FVFTT I S SLGW 
FFAIJUU^QLKTAAVQLLG/rKIRKLTENNEGLRQIRESLKEHOOESAQLQIQSQK^ 
LFHLCGLLVKTKGEGQKI^LLLHRTEENRCUCMQVDSLIQECGEKTEEVCT 
- Kyoqalndeyqatfs EORNMLDKRQ IYIGKLENKVQDLMYE IRNLLQLESD IAEN I PSQ 
t SMAVTGN I SLQLSSELKK I AFKAEN I EAASSLTASRYLHTDTSVHNYSLECROLFDSLR 
EENI04LFVYARQSQRAVFANALFKTWTGYCAEDFLKFGSDIVISGGKQWMEDLHSSREE 
CSGRLVIKTKSRGHLPFRYCIilAI^JKGPLCYHVLGVLYPLHKEVLQS 

CPn_0756 852889 854676 

rpoD-RNA Polymerase Sigma-66 

ISYLPLTKLSSKARNPLVLFOVRKLFMOTONSOATEVSSEEESQKKLEELVALAKEOGFI 
TYEEINEILPMSFDTPEQIDQVLIFLTCMDIQVI^QIDVERQKEKKKEAKELEGLARRTE 
GT PDDPVRMYLKEMGTVPLLTREEEVEI SKRI EKAQVQ I ER 1 1 LRFRYS AKEA I S IAHYL 
ISGKERFDKIISEKEVE^KTHFLKLLPKLITUJCEEDTYLENLLLSLKOPDLSKQEAAKL 
NDSLEKCRIRTQAYLRCFHCRHNVTEDFGEVWKAYDSFLHLEQQINDLKVRAERNKFAA 
AKLAAAKRKLYKREVAAGRTLEEFKKDVRMLQRW4DKSQEAKKEMVESNLRLVI S I AKKY 
TNRGLS FLDL IQEGNMGLMKAVEKFEYRRGYKFSTYATWWI RQAVTRAI ADQART IRIPV 
HMIETI^nCVT J RGAKIa^WETGKEPTPEELAEELGLTPDRVl^EIYK IAQH PISLQAEVGEG 
S ESS FGDFLEDTAVES PAEATGYSMLKDKMKEVLKTLTDRERFVL I HRFGLLDGKPKTLE 
EVGSAFNVTRERIRQIEAKAIJIKMRHPIRSKQLRAFLDLLEEEKTGTSKVKSLKSK 

CPn_0757 854709 855134 

folX-Dihydroneopterin Aldolase 

PCIKNIALVIAI ERYQL I I SKFRMWLFLGCSVEERHFKQPVL I SVTFSYNEVPSACLSDK 
LS DACC YLEVTS L r EE I ANTKPYAL I EHLANELFDS LVI S FG DKAS K I DLEVEKERPPVP 
NLLNP I KFT I S K ELC PS PVLSA 

CPn_0758 855104 856459

folP/dhpS-Dihydropteroate Synthase 

RAHSEPRFVCLSLCSNLGNRFKNLQIARTLLGEQAVljGLRSSVTLETEALLLPGSPPEWD 
LPYFNSVLVGETTL3LRELLVTIKQIEKWGRAEESPPWSPRTIDVDILLYGDESFCCDH 
TEITIPLSNLLSRPFLIALIASLCPYRRFCTQGSPYHNFTFGELAHHLPSPPGMIRRSLS 
P DTMLMGWNVTNDSMSDGGMF LDPEKAV AQA EKLFT EG AAV I DFG AQATNPKVKQ FLS V 
DOEWERLEPVLRLLKETWSNRKOYPI ISLDTFYPEI ILRAMDI YPIQWINDVSGGSQSMA 
EVARDCELSLVMNHS33LFVDPKNILSFSVPIGE0LLSWGEKQLKMFSDVGLNAN0VIFD 
PGIGFGKGAA0SLATLYEIAKFKRLCCPILICHSRKSFL3LFGNHDPKDRDWETVGLSIL 
LQQOGVDYLRVHNVAAHOKALSVAACEACAP I 

CPn_0759 t=S64'i4 HW^l

folA-Dihydrofolate Reductase

LLVKPVnP3NFENPUJVE>1CKNR^VRGr7ACDPRGVrOLECKLPWir,TEDL0FFf;ETrQK 
FPIVMGRKTWETLPPKYFVDRA'/'/VF.';HEKRO(.!VHGE tWVT^LF.EFLL.f.OL.'lSPTFLICKi; 
OELY.'JLFLEMOIVRDFFI:*!) TKKEYAf Jt/rFFn^JLLETWTKTVF.RDToK fTTCYYENHHS 

yrrrKNisL 

CPn_0760 .^'w.'if, • >.', h,-\.\

CT611 hypothetical protein

RHGPKLCLE I PKR.*;OI>VTMK ITT/KTI-K I YPYDDLY.'; ILKJ^LPKLNEh:: r W tTSK [VI 

r/:iryjAWELEKvr:KDELiKOE*\iiAwi--/EKYi u yltkkwg i l i vm i m-::;nvr IYFVLY 

PR DFLL-IVNTLf ;CWLRNFYH LEI K.v nil V. I ): :| CPrPLRRf ITMC UK A W \VVV\ .YNYVi :K V 
^ ^[i^k^TY^NLLLXJLS AAA7I X ,'M r ;f-/ IDEOTl ' I A I [ EEAPK ITFI \\V.\ T'TI'L-L'tW: 'TLA 
[AKUEDLYGPLLO^MAWETPAI'T':; 
CPn_0761 HS7rt'.iR 858375

CT610 hypothetical protein

H I MT3W r E LLDKO I EDQHMLKH EFYQ RWS ECK LEKQQLQ AYAK DYYLH I KAFPCYLSALH 
AfiCDDLO r RRQt LEULMDEEACNPNH IDLWRQFALSLGVSEEELANHEFSOAAODMVATF 
RRLCt^irOLAVfJbGALYTYE 10 1 PQVCVEK IRGLKEYFGVSARGYAYFTVHQEADIKHAS 
EEKEMLQTL.VGRENPDAVLQGnOEVLDTLWNFLSSFrNSTEPCSCK 

•-■A Rn-A f . ■••"inl- i:i.H i -n M-.r^iti 

EELLYFSRFSEKQOTSLCNMETKRSIYMNLPDRKKALEAAVAYIEKQFGAGS IMSLGRHS 
ATHEIST I KTGALSLDLALG IHGVPKGRVI EIFGPESSGKTTLATH I VANAQ KMGGVAAY 
[ DAEH ALDPSYASL IGVN I DDLMISQPDCGEDALS I AELLARSGAVDVI VI DSVAALVPK 
SELEGD rGDVHVGLQAfiMMSQALRKLTATLSRSCTCAVFINQ IREKIGVSFGNPETTTGG 
RAI^FYSSIRI^IRRIGSIKGSDNSDIGNRIKVKVAKNKLAPPFRIAEFDILFNEGISSA 
GCILDLAVEYNIIEiCKGSWFNYQEKKLGCGREP/REEiJCRNRKIJ'EEIEKRIYDVIAANK 
T PSVHANET PQEVPAQTVEA 

CPn_0763 860520 859972 

ygfA-Formyltetrahydrofolate Cycloligase

NFP^DPKIEKSALRKLFISIRRDLSEERKHEASSAVASFVRSFSKE^VVLSFVSFNHEI 
DMOEANRILIQKCTLALPKIDQENLYPVLIPSIDDLISVVHPKDPFSKOTPISSDKITHV 
LVPGLAFDQ0GYRLGYGHGFYDRWLAQHPYPSIRTIGIGYCEQKIDRLPQE5HDIPLSQI 
YLC 

CPn_0764 861819 860524 

CT648 hypothetical protein

GYKSMDIKKLFCLFITSSLIAMSPIYGKTGDYEKLTLTGINIIDRNGLSETICSKEXLKK 
YT KVD F LA PQ P YQ KVM RMY KNK RG DNVS C LT A YHTNGQ I KQ YLEC LNNRA YG R YR EWHVN 
GNIKIQA£VIGGIADLHPSAESGWLFDCTTFAYNDEGILEAAIVYEKGI^EGSSVYYHTN 
GNIWKECPYHKGVPCCKFLTYTSSGKl^EOWC^KRHGLSIRYSEDSEEI^^ 
EGRLLKA£YLDPCTHE IYATI HEGNG IQAIYGKYAV I ETRAFYRGEPYGKVTRFDNSGTQ 
IVCTYNLUX3AKHGEEFFF^PETGKPKI£IiWHEGIL^ 

KSGLLT I YYPEGQIMATEEYDNDLLI KGEYFRPGDRHPYSKIDRGCGTAVFFSSAGTITK 
KIPYQDGKPLLN 

CPn_0765 862415 861801 

CT647 hypothetical protein 

ttiyikujgpxmkkwisililsfi^llsilpvlaitinhvkisqrwsdlnsqiltlkvir 
dhedqvikhnar i skdrnnls i eslnasckqlrpls kererlnklnsns llaqs kevwer 
krv^eksnholvwnceqmhotfatvrleqatemdnedieslfsl™ 
kmwqttplgnevwlthaeaisrwi 

CPn_0766 863785 862394

CT646 hypothetical protein

A^KLPWHIGLTKAENNTIKIAILQKTCKGWIVCHCEQIPEGKTWSLPKKYFAAPTTF 
SI^SDILVKSSSSSLKOTKNILKVALTNLEASIJ^PWESLIVQPQLGKPTDPI3ETPLTL 
W^KOTLJOCELSFLSQAQIFPDKI^CRAADIFFLAEQSPUCSLPAYI^IYGGSEEVTCI 
F^QiHAIAVARSFSraSTKKSCDDIHATLQYIOETFPC/IVLPAIHVAOISP^KILEQK " 
LSjLPLWCOSMTYGVEDEDWEIYGDTIAAAHHGASRRPLTFPYDATSVSPAAQKHWLLR^ 
Sia^KYALMATVWSLGSVLKUCSI^SSASNH^^ 

KN&ftSNYPLLPT I PTS EQTLKFLLALGKSS PS I KFS YFS YTMTSYPSKDNPSLPYSALV 
VKI^30PEDIPQFLKKISSHPKLQHVSESLEI)QRSFKLQFTLSS 

CPn_0767 863878 864177

CT645 hypothetical protein

NI^SYLLRTAI^AnfSFLILAYIFASWVPDCOSARWYQLVSKCVDPFLNFFRRFVPRIa 
IDJ^PFVGLLCLG ILPFVILRVLRF 1 1 LN I FHS PWLLQYL 

CPn_0768 864144 865163

yohl/nir3 -predicted oxidoreductase 
YFSFSMAAPIFIKNILLRSSIWAPL^FSDYPYRCMSALYQPGLWFCEMVKVEj^ILYAP 
ERT^KLLDYTJEMdRPIGAQI^GSNPETTSGEAAKILEGLGFDLIDLNCGCPTDKITKDGSG 
SGLDCTPEL IGRILDKI INSVS I PVTVK I RSGWDMEH INVEDTVR I IRDAGASAVFVHGR 
TR^3GYHGPSKQEYISRAKAAAGKEFPVFGNGDI FS PEAAQAMLTTGCDGVyvARGTLGA 
PWI^QIQDYLTTGSYEKIPFIKRKAAFLE^RLVEDYYOSETKFLSETRIJLCGHYLISA 
AKVWLRSSLAKATSYQEVYQLVNDYEEADDSSLETFVKC 

CPn_0769 867763 865121

topA-DNA Topoisomerase I-Fused to SWI Domain
S IQGPHAIRLMKKSLI IVESPAKIKTLQKLLGSEFVFASSIGHIVDIff AKEFGIDVDHDF 
EPQYQVLPDKQEVINH I RKLAAKCEKVYLS PDPDREGEAI AWH IANQLPDSPLIQRVSFN 
AIT KNA VT EALK H PRT I DMALVNAQQARRLLDRI VG YK I S P I LSRKLQQ RSG I SAGRVQS 
VALKLVVDREKAIDAFVPVETWNLRVl^QDPKTTKTFWAHLYAwfcKKWEKE I PEGKTEN 
DVLLINSEEKARHYAELLEKSSYTITRVEAKAKRRFAPPPFIT/TLQQEASRHFRFSASR 
TMS I AQTL Y ECVDLDS EDSTGL ITYMRTDSVR VDPEALTTVRST IQCTFGKEYLPEKANV 
YTTKKMTODAHEAIRPTDINLTPDKLKNKLSDDOFKVYNLiykRFVASQITPAIYDTLAV 

OittdteidlrasgsllkfkgflavyeekoddendqeedhbCpplhaodalikeevsqeq 

AFTKPLPRFTEASLVKELEKSGIGRPSTYATIMNKIQSREfrTTKENORLRPTELGKIISQ 
FLErTHFPRIMDIGFTAl^EDELELIADNKKPWKLLLOEFWTTFLPVVITAEKEAVIPRIL 
TN I ECSKCHKGKLVK I WSKNSYFYGCSEYPECDYRTSe/eLAFNKEDYAEDTPWDS PCPL 
CGGVMKVRHGRYGTFLGCEKYPECRGTISIHKKGEEIEQEEPIPCPAIGCNGKIFKKRSR 
YMKI FYSCS EYPECSVIGNS IDAVITKYSGTEK I PY^KKTPTKKKSSAKTTKAAKTPSKK 
CKAKSSVKKSS EKKTG PLFLPS PDLAKM ICNE PVS KlEATKK IWDY I KEHQLQAPENKKL 
LVPDNNLAT I IGPNP I DMFQLSKHLSQHLTKVSNDESSASS 

CPn_0770 868322 869131

CT642 hypothetical protein

kprtrnveklefvtclcspdddlitfnkogl/agpeeekvaflvrsnamldagpetpasf 
pe:jlre0f0efpeyvevly3necldvweacotwiunievtiqlrkhhrkasrwlgmysrd 
t:vlj\meavriavrmkfhepvfeevu\y0tsbwwrrffgplfr3pgesylllfftilglgi 
: ; t.wy pag 1 1 .1 mlvl pm y f l ,mr lcmaqs y lfyramk k i p km lc v p plwvllrltdke i kmfa 

Kf-;f-rPVLEHYARKRKLRN\'RWK0IYQ:7Y/ 

CPn_0771 K7i) r i 11 /Sh'iH.l

RNA T'otyitiLT.,:;,; :j Ujnu/i-l 

YV\ A ;.Y.)\ .V.EV.W/'Vi 'VRrTNiJTF^YLNOTri'JPOE^LYTRLLPQIEEAFUTAEERFIAHQI 
ANriI.:;i)W!lJ^.l<NIMXFA0FXE^LEKIIIKVWrvrtOriLSPEfl[AGPSL0SYWMKLLRNSS 

"LOAY::ivr<rx;YrLmrEFAMMKKF3L:;L^ 

TN .I'ht Yf.f-'Y:;^ :.';WK r ITV^MGhP:, rKt,NKF/rFHFYEIILPKEEQKNL3COII-3AKWLIK 



pk^^Bgkipa 

qd:S^Penvlqw 



NLRKREC?TLLOVMETLLPK^^»GK t PAPYPUJ I KDLAEDLJFHE3T 1 FRA I ENKAVA 
AP IG t FPLKHLFPRC I HQDS^PeNVLQW r RQ^t ATEQTPLSD3V I3DR ITAKC I PCAR 

rtvakyraolk ilpankpkklfy i r3:;n3hfrdrof 



CPn_0772 872400 8704; 

uvrD-DNA Helicase
KLGLIMTCISELNEAORKAVTAPLNPVLV^AGAGAGKTRVVTYRILHLINOG IAPREILA 
VTFTNKAARELKEP rvNCCASTNEFDVPtfc/CTFHSLCVFTLRRS rNLLNRENNFTIYDQS 

!;A^:^".:K!^A:..v;!r;^>"■::;.A::K^CAi^.y^AKr:^:. : '.?\::\ *;rv t::::v?ey;kkl 
iKAriAUM-nM ™/:-L:^»:.W'KAo/.Yri<.': wka;.;.. rv:<<; \ tmha<;yt:^cll.;koh?. 

NVFAVGDPDQS I Y jWJAN IHN I LNEEN DY PN AKVLC LEENY hS YGN I LNAANAL I KNNA 
SRLEKELRSVKGPGEK IRLFLGSTDREEADFVAAEI LOLHRVGNIKLRDIC I FYRTNSQS 
RTFEDALLRRRI PYEI IGGLSFYKRKE IQD t LAFLR I F I SKSDIVAFDRTVNLPKRG ICS 
TT I FALTQYAIAQGLP I UCACQWLDTKDVKLSKKQOEGLQEYLALFPQI EHAYNTLSLR 
DFIESWRITCTLEILK£DAI!T/kDRKSNLEELYHKA^ 

SDDDl^LTADRVNUfTUiNGK^EFRVSFLVGLEEQLLPHANSLjGGTYENIEEERRLCYV 
GITRAQDLLYLTAAQVRSLWGTVRMMKPSRFLKEI PKDYMIQVR 

CPn_0773 872485 873195

ung-Uracil DNA Glycosylase

FMQNAT I DQLPVSWQEMPLCWREQLKE^SK PYM^ 

ALRSTPFD0VRWILGJ5DPYPGKGQAHGLSFSVPEGQRLPPSLINIFRELKTDLG I ENHK 
GC LQSWANQG I LliNT^TVRAGE PFSHAGKCWELFTDA I VT KL IQ ERTH 1 1 FVLWGAAA 
RKKCELLFNSICHQHJ^SPHPSPLAAWRGFFGCSHFSKINYLl^KLNKPMINWKLP 

CPn_0774 873183 873425
CT606.1 hypothetical protein 

LEAPMNEGIHSVCFQKTPRLTAKSWSMEMl^TTC^LPSAEGMPSVA^EADFLRAEALL 
AEMREIRGCLEQSLRTLVPSE 

CPn_0775 874040 873414

yggV family

ERFMKIWASSHGYKXRrTKTFLKRLGDFDIFSLSDFPDYKLPQEOGDSITANALTKGIH 
AAIWLGCWVIADDTMIJ^VPAI^LPGPLSA 

AYFECjAA/LVSPWEIFKTYGrCEGYISHOEKGSSGFCYDPIFVKYDYKG/TFAEI^EDVK 
NQVSiWAKALQKLAPHLQSLFEKHLLTRD 

CPn_0776 874180 875487

CT605 hypothetical protein
?A FVLKNFYDCLLMFFQFLSFTMKKIFYSFVLLSC I FPYVGC AQVFVGLDR I FS EGEYTR 
CICGKKIALISHSAAINSRGQDALSVFYSRXHDCTVEILCTLEHGYYGATPTETV^ 
f RY PNLRSVS L YGVKEVPKEVAEHCDVFVYDVQ D IGVRSY S FVTVLMQ IVKAS ERYGKQL I 
VLDRPNPMGGRIVDGPLPNPTTSGSLAI PYCYGMTPGELALFFKKTYAPNANWVI PMKG 
WNRShfTFDETGL IWMPTSPQMPDPQS PFFYAATG I LGALSVAS IGVGYTLPFKVLGAPWM 
CXIEKVADELrmMKLPGVLFLPFFYEPFFGKYKMIMrSGVlXVI^DPKIFYP 
VLKALYPKQVEQTLKS I ER I PARRSS ICNLFGGDEFLS I SHKERYI VWPLRRLCKESRES 
FHQLRSSCLLSEYAES 

CPn_0777 875586 877178 

groEL_2-heat shock protein-60 

TS EDRWWVFKSQFEG LSAUCRGVHALTKAVT PAFG PRGYNW I KKGKAP IVLTKNG I R I 
AKEI ILQDAFESLGVKI^EALLKWEGTGDGSTTALWIDALFTQGLKG I AAGLDPQEI 
KAG I LLSVEMVYOTLORQAI ELQSPKDVLHVAMVAANHDVTLGTWATV I SQADLKGVFS 
SKDSG I SKTRGLGKRVKSGYLS PYFVTRPETMDVVWEEALVLILSHSLVSLSEELZRYLE 
LISEO^m^PLVIIAEDFlX}^^RTL^L^^aJ^GLPVCAVKAPGSRELROVVLEDLAILTG 
ATLIGQESENCEIPVSLDVLGRVKQVMITKETFTFLEGGGDAEIIQARKQELCLAIARST 
SESECQELEERLAIFIGS I PQVQ ITADTDTEQRERQFQLESALRATKAAMKGG I VPGGGV 
• AFLRAAHAI EVPANLSSGMTFG FETLLQAVRT PLKVLAQNCGRSS EEV I HT I LS HEN PRF 
GYNGKTDTFEDLVDAGICDPLIVTTSSLKCAVSVSCLLLTSSFFISSRTKT 

CPn_0778 877400 878092 

tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase

APVAQSDRVPGYEPGGQRFESSLVRNNKRVEEEVFhfTLSLVGKEAPDFVAOAVV^ETCT 

VSLKDYLGKYWLFFYPKDFTYVCPTELHAFQDAijGEFHTRGAEVIGCSVDDrATHQQWL 

ATKKKOGG I EGITYPLLSDEDKVISRSYHVLKPEEELSFRGVFLIDKGG I IRHLWNDLP 

LGRSIEEELRTLDALIFFETNGLVCPANWHEGERAMAPNEEGLONYFGTID 

CPn_0779 878502 878095 

CT602 hypothetical protein 

RFDLI FOMKFTVALFGEAEKGSYDTAYFCRSLVDLHNYLGDVSSPG ITLAI KTLLSDYNV 
VYFRVT?EECYCVDSYFFGLHFLNTQTTLKN I IAIGLPCVGNQH 1 1 EASRSLCQKHNSLLL 
FFDHDLYDLLTFNQPF 

CPn_0780 879241 878591 

papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase 

HGNK I AVQSLRFMHAKLSFFILLSLLFSG IDCSRLHAAGRSPSUXJVtAEI EDI SAKLAS 
H EVE I VM LS ERLDEODS KCQKWT AAK PET LAO K I RELESDOKALAKTLAVLTTSVKDLQT 
NLQSKLOEIOKDHRALAQDLRLVRRSLLALVDSSSPGAYADFSDPVPENIYtVREGDSLS 
KIAKKYKLSVTELKKINKLDSDAIYAGQRLCLQRNKQ 

CPn_0781 879851 879198

pal-Peptidoglycan-Associated Lipoprotein

ONC YR3 RRKTVP LLGC F PS ATDK ENTMN I HS LWKLCTLLALLAL PACSLS PNYGWEDSCN 
TCHHTRRKKPSSFGF\'PLYTEEDFNPNFTFGEYDSKEEKOYKSSQVAAFRNITFATDSYT 
IKGEENLAaTNLVHYMKKNPKATLYIEGHTDERGAASYriLALGARRANAI KEHLRKQG I 
SADRL3T IS YGKEH PLNSGHNELAWOQMRRTEFK I HAft 

CPn_0782 881077 879773

r.nllj poLyL; t )cch.uide transporter 

CUt^MLROLCFOVFFF^FA^LVYAEELErAA^^EHITLPrEV^COTDTKDrKtOKYL^CL 

TFaF^KntAtA^D{:i^rTA.ASKESSSPLArsLRLHVPOL3r/LU}nr:KTPcrru::;hT[:;oN 

LSVDI'OK rifHAA[7rVHYALTGI PG tCAGKrVFALSoLGFCOKLKC^^LWTrPYtJ' 1KNI.AP 
f/nT^.';i.;;tTPKWV\A\^:NFPYLW3YKYCVPKIFLC.SLEriTEnKKVLrf J K(;NOr>l!TF.'-; 
(^^KKLI.AI-VAI/rYr.NriH.tMOlM-.'U.T^GPMGRPPRLLNENFGTOf^NP^FNrRn-gi.VFr:-; 

NKU';i^^n..YiM::i.ni'KivAiiuxTKKYRN!;scPAW5PD^KKiAF(:::v[K(;vr<yif:rYni.:; 

;KijYOi.'rt*::rrNKR::r:'.wA t [i:irh[ j vf.';a(.;naeegelyl.I3lvtkktnk i a k ;vw:kri' 
n::w;AKrv.HMKi<TL 

CPn_0783 US! HUH HHILU';

rMMKYLPYtAtTArillUIl l ( I.L.VFA::r'f.F'KKRLOPKAFOEKLVTrOr'Kr'l'VI"ri':;VVVljP 



108 



ET^MSWI 

SFL^PSAAJ 



S PAPTVAKKTTATEKP 
fPSTAQLTMHS ELKAT 
lAADKQLLTQR IQALPFQ 



AKT [RP: TVATO PQKQA KC3 P PO envqkalqk P I PKV 
ppqTTKKNT0L.1KTQLQTL3EVAQALSLHVOK I EKS I 
QEDELCELFRTH tALPSKGYVRIKLVLSPNGEIQECSFC! 
KFLEKYKV5KN C3FH I KLVSNE5 

CPn_0784 881892 
exbD-Biopolymer Transport Protein

nRADSTFTTFFYP^nrTOYKrWKYPFTEEIEEEPL'/NLTPLIDEVFVlLMAFIVAVPLIK 

i,i^:iALA{t:-!VK/iv;t..:.:K:itj:J!AViKVi-MiJft:::::"-N!-:ii; r:i*jK::r/hiJ?i.iMt\\'\ '"r 



Lr^B" 



17 



vtr/KiiAii:A/,';KHHLiiVALor: 



CPn_0785 883039 882296 

exbB/tolQ-polysaccharide transporter

DHLYFETLSVNKDFYSMVHFSHNPI IQAYTEADFFGKSI FFCLLILSVCTWTVLHQKLAI 
QKNFLKAGKSLKDFL I KNRHAPLSLD I H PELS PFADLYFT I KRGTLELLDKNRQSAPDRG 
PILSSEDIQSLETLLGAIMPKYKAIXHKNSFIPATTISLAPFLGLLGTVWGILVAFTHIS 
SGSSGNSAIMEGLATALGTTIIGLFVAIPSLIAFWLKAHSSELISEIEOTAYIXL^ 
VKYRNTNL 

CPn_0786 883137 885293 

dsbD/xprA-Thio:disulfide Interchange Protein

NHGVILNKFKTYLGTALIAPFFSFPALSGSFSSIQAEEITQQVNHPGAELLSEGSYIPGL 

QTFRLGIKITASKGSHIYWKNPGEIGSPLKISVQLPKGFVVEEEHWPTPKVFEEEGTTFF 

GYEDSALIVADVRAPEGYTPGQEVELRAQVEWLACGDSCLPGNVDLKLTLPYEEKEPSLY 

PDTHAEFTKTI^AQPRVLENDHSVQVACGKGNEIILNISKKINATKAWFVSEKADKLFAY 

AET5YSGGTGTAWRLKVKNLSGV0KNEKLHGIL^LADHTGRPVESLTIHSE^K?TGSAV 

AG LSQY IT I LIMAFLGGVLLNIMPCVLPLVTLKVYGL IKSAGEHRS SVI ANGLWFTLGW 

GC FWGLAGVAF I LKVLGHN IGWGFQLQEPMFVATL I IVFFLFALSS LGLFEMGTMF ANLG 

GKIX?SSEMKSSNNKAVGAFFNGILATLVTTPCTGPFLGSVI^LVMSLSFLQQLX 

LGMASPYLVFSWPKMLSVLPKPGGWMSTFKQLTGFMLLVTVT^ 

LLGGLWl^LC^WILGRWGTPVSPKKORVCASLLFFAFl/GGAISVSGIASH^ 

SVNEDSLWQPFSLEKIAOLRACXjRPVFVNFTAKVTCLTCQMNKPVLYGDAVQKMFET 

TLEADWTRKDPGITEELARLGRASVPSYVYYPGDiNSAPVVLPEXIT^ 

CPn_0787 885604 886401 

yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase
. T RROPVDLADAHVHLSDDAFEEDINSVI^RAQDSGVSL\AWmEKEIJJRSFAYAERFP 
KIPJCHVGGTPPQDVDQDIEEDYRNFHAAAHSKKLAAIGEVGLDYCFATEEGIAEQKEVL 
QRYIALSLECELPLVVHCRGAF>TOFFRMLDQYYHNDPRSRPGMUfCFTGTLEEAQELISR 
GWF I S I SG I VTFKNAQDLRDLWELPLEHLLI ETDAPFLAPVPYRGKKNEPAHVLHT INA 
VAN^GMFPQELAALAYKNVLRFLHG 



CPn_0788 886521 887432

sdhC-Succinate Dehydrogenase
SLiMSLRMSRHE ICPEVSHKKGKYYSTFIFRC IHSLAGIAFTFFLC EHLFTNMLASSYFS 
CGKGFVAMVNGFHKIPGLKIIEVAGLVLPFLCHAI IGIVYLFQGKSNCYSGDGSRPHLRY 
AKI^KSYTV^RWrAWILLFGIAFHWH 

KGHLTLNLPNTEASSI EVSRHDLGGADAAIJ-SERNSYLLTPSAGTAFLYVVRDALGSLFI 
. AL^ILVIAAAFHGF^LWTFCCRWGVWSLRMC^ 

YSjVA 

CPn_0789 887436 889316

sdhA-Succinate Dehydrogenase
OM|SE^RKVI WGGGLAGLS AAMQLANLG I IVELVSLTKVKRSHSVCAQGGINAALNLKPE 
EEt^PYVHAYDTIKGGDFLADQPPVLEMCIAAPRIIKMLDNFGCPFNRGPSGN* 
GTLYHRTVFCGASTGQQLMYTLDEOVRRREHAGRVI KRENHEFVRLVTDHSGRACG 1 1 
^FWNRLEILRGDAVIIATGGPGVIFKMSTNSTFCTGAANGRLFLQGMAYANPEFIQIHP 
T AiPG RDKL RL I S E SVRG EGGR VWVPGDS SKRIVFPDGSERPCG ETGAPWY FLEDMff PAY 
GN LVS RDVG ARA I LRVC EAGLG I DGRMEA YLD VTH L P EKT RH KLEWLD IYKKFTjCEDPN 
TVPm I FPAVHYSMGGAWVDWPAADDPDRDSRFRQMTNI PGC FNCG ES D FQ YHGXNRLG A 
NSLLSCLFAGLVSGDEASRFIEATGASOATSSDFDRALG^EKEENARLLSASGiCENIFVL 
HEETEAKIMVRNVTVKRNNRDLQETMDKLKEFRERIJQW 

LELAJLAITKGALLRNEFRGSHYKPEF PERDDEHWLKTTVAVYAPEEPE I SYIIPVDTRHVA 
PTLRDYTKSSTGKIELTNIPDNIRLPI ' 

CPn_0790 889279 890103

sdhB-Succinate Dehydrogenase
NS RSFL I I S VY P YRKREMMENLETFI LKI YRGVPGKQYWESFELPLbfPGENV I S ALMEI E 
KR PVN I LG EKVN P WWEQGC LE EVCG SCSI LVNGVPRQ ACT AL IQ& I D ATO S RE I VLAP 
LTKFPL I RDLIVDRS IMFDNLERIQG WVAADI EGETFGPQVTQE&ELL YALSQCMTCGC 
CTEAC PQ I DNKSDF I G PAAI SQ ARYFNTYPGDKRSKKRWRALMGKGG I EGCGQAHNCVRV 
CPKKLPLTESISAVGREISKFSLRSLFSALFKKKK 

CPn_0791 893104 890111 

CT590 hypothetical protein 
TC LRS S RK I WEDI SDRNMYSC YS KG I SHNYLLH PMS/LDI FVFDS L I ANQDQNLLEE I F 
CSEDTVLFKAYRTTALQS PLAAKNLN I ARKVANY I l/dNG EI DTVKLVEAI HHLSQCTYP 
LG PHRHNEAQDREHLLKMLKALKENPKLKES I KTLFVPSYST IQNL I RHTLALNPQT I LS 
T I HVRQAALTALFTYLRODVGSCFATAPA I LIHOCYPERFLKDLNDLI SSGKLSR I VNQR 

eiavpinlsgcigelfkplrildlypdplvklssspglkkafsaanlietlgdseaqiqq 
llshqyl^qklqwhetltandiikstllhyyOlqestvraiffkeglfskeqvafstqh 

PR ELSE IQR VYH YLHAYEEAKSAF I HDTQNP/lKAWEYTLATLADASQPTI SNHI RLALG 

wks edphslvslvthfveeeveni r i lvqccegtyhearsqley i eg rm rn p lnnqdsq i 
ltmdhmrfroelnkalyewdsaoekakkf/hlpefllsfytkqiplyfrssydafiqefa 
hlyanapagfrilfthgrthpntwspiy/inefirflsefftstesellgkhavinleke 
ts rlvhn itamlhtdvfoealltr I LEAYQLPVPPS ILNHLDQLSQTPWVYVSGGTVDTL 

lldyfengepltltekhpenphelaakyadalkdlptgiksyleegshsllssspthvfs 
1 1 agsplfre^wdndwysytwlrdvwkqhqdflqdtilpqls iyafi enfcnkyalqhv 
vhdfhdfc5dhsltlpelydkgsrflsslftkdkt/aliyirrllylmvrevpyvseqql 
pevldnvs:; ylg i s:;r ity ekfrsj£ eet i pkmtllssadlrh i ykgllmos yqk i ytee 
l/rv lrlttamrhhnlay rapllf^d.'jnwps i y fcf ilnpgtte i dlwkfnyaclogqpld 

fllOELFATr;HPWT[,YANrtDYC;MpPPPGYRSRLPKEFF 
CT589 hypothetical protein

RHHL [N [Kf ; [ R [MKHTFTKRffLFFFFLVI PI PLLLfl[>TVVr,FF:ii'SAAKANLVOVLHTRA 
TriL::[EFf:KKLTrHKLFLD/LAhTTI^LKSYAGP:3AEPYAOAYNt>1MAU;NTDF^LCLrDP 
Kr/^rjVRTKrJrr,OPF[RYL/OHPEMKKKLSAAV^KAFr.LTTr<;KPLUiYL[LVEDVASWDS 
•[TT:;GLLV:;|-YPM::i-Xo/DLFQSLlf[TKGNICL.WKY'Jl::VLFf;AOD:":E;'^FVFSLDLPNL 

[•Ofqar:ji*:;a i e i eka/c, r lgcenl itvs enkkryl/;lv[ ,nk r r ro^rYTUXVPVSDL i 

0::ALKVPL.raCFFYVy!l\FLL>!WWIF^KINTKLNKPLOF.LTR:MK-V\WR( -NflNVRFEPOPY 



CYEETYEUJNIFNrTLLLLLf^^HDYI^mvK^^ 

V^LOKrG^ DT^S^CKTTEO^A^VAhfrF I KYVEKD^'XELLJLJEGAPTMFLQRGESFV 
RLPLETHQALQPGDRL rCLTGCEDlLKYFSOLF IKLLKDPLNPLNTENL'DSLTMMLNN 
ETEHSADGTLTILSF.'j 

CPn_0793 8'»SH38 8 ( >4iL.

rsbU-sigma regulatory family protein-PP2C phosphatase (RsbW

' ,:ir '' ,I,,:! ;.'l.L ..... v - v ./...., ^ ... • . v! . ;.\7'-~ K-N'vVFKA 

^LTOIVPLNVDVLGLFSDVLDLDAG I pApNV^LJNELMCKVFQGIYNEISLIKVFPNCD 
KIWASS I PEHLGENYNHK I D I PKNT P/LAALKQSPKNQEVFSVMQANVFDAKTQELCC I 
LYTTFSAESLIJCDLLINKGSYLTVKT/iLSKYGVILK^ 

FLNDDPCPIDSELGPLTLSPLDIGEyFYSFK I KDTE IWGC I ENVPS IDIAVLSYAKKEES 
- FAPLWRRARMYTAYFFC ILLGS L1XF I VARRLSLP I RKLATAMI ESRKNKNCLYTDDSLC 
FE INRLGH I FNAMVENLHKQQH LJKTNFEMKENAQNALHLGEQAQQRLLPNTLPSYPHIE 
LAXAYIPAITVC^DFFDVFVVGroSKARLFLIVADASGKGVNACGYSLFLKNMLRTFLSR 
SS SLCOAIQETSRLFYN^KNS^FVTLCVYCYHQTSNTMEYYSCGH PPACYLDPDGETS 
WLFHPGMAWFLPEVANITS^FHPKPGSLFVLYSDGITEAiiN^DMFGEERI^AAICC 
LTGKSAAI)AVHRLMLSVKTE i vGNSHQHDDITLLILKVLES 

CPn_0794 897123 898004

No robust homolog present in Genebank/EMBL as of 11/7/98
KS SKH RS FLLKKSGGNOVSLYQKWWNSQLKKS LCYSTVAAL IFMIPSQESFADSLIDLNL 
GLDPSVECLSGDGAR6VGYFTKAGSTPVEYQPFKYDVSKKTFTILSVETANQSGYAYGIS 
YDGT ITVGTC S IjG AJwCYNG AXWS ADGTLT P LTG I TGGT S HT EARA I S KDTQV I EG F S YDA 

sgqpkavqwasga/rtvtqladisggsrssyayaisddgt i ivgsmestitrkttavkwvn 
wptylcttlggd/stglyisgdgtviv^aaotatv^ 

CPn_0795 898008 899195

No robust homolog present in Genebank/EMBL as of 11/7/98
GTLGGANS^TGVSS DGSVI VGQAQTADKSVHAFQYYNG EMKCLGT LGGTS STAKTVS PD 
GKVIMGRi6IADGSWHAF^CHTDFSS^WVLFDLD^^ l YlCTLR^JGRQLNS I FNLQNNWLOR 
AS DHEFTCFGRSN I ALGAGLYVN ALQNLPSNLAAQ Y FG I AYK I R PK YRLGVF LDHN FSS H 
- VSHNRLWMGAF IGWQDSDALGS SVKVS FGYGKQKAT ITREQLENT EAGSG ESHF 
EGVA^IEGRYGKSLGGHVRVQPFLGLQFVHITRKEYTENAVQFPVHTOPIDYSTG^ 
GIGS^IALVDSLHVGTRMGMEQNFAAHTDRFSGSIASIGNFVFEKLDVTHTRAPAEMRVN 
QSLNLILRVNQQPLQGVMGFSSDLRYALGF 

CPn_0796 899280 901340

No robust homolog present in Genebank/EMBL as of 11/7/98
/sELYSSYLQPCLNMSIVRNSALPLrcLSRSCTFKXVRSHMKFMKVLTP 
LLTAIPGSFAHTLVDIAGEPPJiAAQATGVSGDGKIVIGMKVPDDPFAITVGFQYIDGHLQ 
PLEAVRPCCSWPNGITPDGWIVGTNYAIGMGSVAVKWV^KVSELPMLPDTIXIS^ 
VSADGRVIGG1^I^JLGASVAVKV^DVITQLPSLPDAMNAC^^JGISSDGS I IVGTMVDV 
SWR^TTAVGWIGDQLSVIGTLjGGTTSVASAISTDGTVIVGGSFJJADS 
IGTLGGFTSIAHAVSSrX3SVIVGVSTNSEHRYHAFQYADGOMV^ 
i DGKVrVGRACVPSGDWHAFLC PFQAPS PAPVHGGSTWTSQNPRGMVD I NATYSSLKNSO 
I QQ LQRLL IQHS AKVES VS SG APS FTSVKG AI S KQS P AVQNDVQKGT FLSYRSQVHGNVQN 
/ (X)LLTGAFMDWKIJ^APKCGFKVAIJrfGSQDALVERAALPYTEC^ 
' RYDFT^LGETIVVLOPF>KJIQVLi^LSREGYSEK>r/RFPVSYDSVAYSAATS 
PKMSTAATLGVERDLNSH I DEFKGSVSAMGNFVLENSTVSVLRPFAS LAMYY DVRQQQLV 
TLSWMNQQPLTGTLSLVSQSSYNLS F 

CPn_0797 901552 902694 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VLILTOINVLTICLGLNMSKKIKVLGHL^^ 

EDV^YTFTDLEI^KEGWSEAHAVSGNGSRIVGASGAGCKSSVT-AVIVrtSHLIKHLGTLG 
GEASSAEGISKTX3EVVVGWSDTREGYTHAFWDGRDMKDLGTIXJATYSVARGVSGDGS I 
VG VS AT ARG EDYGWQVGVKWEKGK I KQLKLL PQG LWS EANA I S EDGTV I VGRGE I SRNH I 
VAVKWNKNAVYSLGTLGGSVASAEAISANGKVIVGWSTT^ 

GGGFSVATGVSADGRAIVGFSAVKTGEIHAFYYAEGEMEDLTTLGGEEARVFDISSEGND 
I IGS I KTDAGAERAYLFH I HK 

CPn_0798 902810 903856 

No robust homolog present in Genebank/EMBL as of 11/7/98 

WFE 1 1 FWRVPMKKTCCQNYRS IGWFSWLFVLTTQTLF AGH FID IGTSGL YSWARGV 

SGDGRVWGYEGGNAFKYVDGEKFLLEGLVPRSEALVFKASYDGSVI IG I SDQDPSCRAV 

KWVNGALVDL/3IFSEGM0SFAEGVSSDGKTIVGCLYSDDTETNFAVKWDETGMVVLPNLP 

EDRHSCAWDASEDGSVIVGDAMGSEEIAKAVYWKDGEQHLLSNIPGAKRSSAHAVSKDGS 

FIVGEFISEENEVHAFVYHNGVIKDIGTLGGDYSVATGVSRDGKVIVGHSTRTDGEYRAF 

KYVDGRMIDLGTLGGSASFAFGVSDDGKTIVGKFETELGECHAFIYLDD 

CPn_0799 905001 903940 

No robust homolog present in Genebank/EMBL as of 11/7/98 

KREENMAAIKQILRSMLSQSSLWMVLFSLYSLSGYCYVITDKPEDDFHSSSAVKWDHWGK 

TTLSRLSNKI^SAKAVSCTGATTVGFIKDTWSPTYAVRWNYWGTKELPTSSWVKKSKATG 

ISSDG3riAGIVENELS0SFAV*TWKNNEMYLLPSTWAVr;SKAYGISSDGSVIVGSAKDAW 

SRTFAVKWTGHEAOVLPVGWAVKSVANSVS ANGS 1 1 VGS VQDASG I LYAVKWEGNT ITHL 

GTLGGYSAIAKAVS^GKVIVGRSETr^GEVHAFCHK^'/MSDliGTLCrcSYSAAKGVSAT 

GKV rVCMSTTANGKLHAFKYVGGRM I DLGEYSWKEACAJIAVS I DGE I TVGVOS E 

CPn_0800 906550 90524'j 

eno-Enolase

RKEIKIMFEAVIADICAREILDSRGYPTLHVK'VTTGTCrr/GEARVPSGASTGKKEALEFR 
DTDGPRYCGKGVLOAVKNVKEILFPLVKGCSWEQSLILSLMMDSDGSPNKETLGANAIL 
GV^LATA! I AAAATLRR PLYRYLGGCFACSLPC PMMNL It I^MHADNGLEFOEFM I R PIGA 
3S IKEAVNMGADVFHTLKKLLHERGLSTGVGDE^JGFAPf JLASNEEALELLLLAIEKAGFT 
P(jKD I r; L/\LDt.'A/\S: - FYNVKTCTYDGRH Y EEQ lAILnHL'JDRYP I DM I ErOLAEEDYDGW 
ALLTEVU ;EKVO T VGPDL.FVTNPEL I LEG I SNGLANSVL I KPNO I' "I'LTETVYA I KLAQM 
AGYTT I irilRr.r.irrrp-rTlADLAVAFNAGQIKTG^LCKr.-RRVAKYMRI WEtREELC'SEAI 

FTD:;rrvi-:;Yi:D:;KF 

f.'PnJiHOI 'tOS70*> 'H)C,127 

uvrB-Excinuclease ABC Subunit B

1 1 prM-n-VLiiAi -!-,M-i 'Givt^KA [ APL:;Ac;vRM07K.wr .um\ v.rvFv i amwanvtil 

r'l'l.VtAIINKTLAV.M.YOKi-'KKKFPtiriAVEYFI rjYYDYYOi'EAY I Al'.:;i 7T\ I KKJILL TNDF, 

[dki,I'.l:;atr:; i i.f.Kum'L i v.:::vr.< : i yg ic/.i penyt^jmalvi ,ev ;k ky vuti r ltaolvk 
MHYfjA:;p[[\n<::At-RKi{« ;::v I nm-AYF^KLALRLEFUiDTLT:: [KYsnn.rM i i-kf.::vp 

DNVWW :::ilYV { PEA I Ul-^A I HT VjEIA.KE RMAFFDDRMEKPK t FIIKTmr* I I'M IKK IG 
FCKfJIKNY::RHr!i;Ani;A!'|-ri;f,I.l//FIM::DFLL[ [DF-^IIVrLlXf [PAMYRClXjllRKO:;!* 
VEY( --FR LIV; AFDNRPLTYEEAOKYFRK-y I YVG ATPGD^^«CH 1700 1 r RPTC I PDP
MPErRPATfWVDDLLEEIFLRLSOKHEKILVISITKK^^BbFLSELEIPAAYLHSCI 

ETAERTQ I 10VL rCVNLLRECLDLPr/SLVWWADKECFLRSTSSLIOFCC 

RAARN [ NCKV I FY ADQKTRS 1 EETLRETERRRQ IQLDYNKEHNIVPKP I I KAIFANP ILQ 
TSKDSESPRKORPLSKEDLEEOrKK-ZEALMORAAKEFRFNEAAKYRDAMQACKEOLLYL 
F 

CPn_0802 009761 90R70? 

Tr/pr.' .li.ir.v ! : i'.'IA .'yt j..\r->: 

!AnvMNKKKPv:;:-';;/i'i'r';KLHf/;ffWVt ir:;RLEUju::pwzy:i-y • :aol!!t:.ttk 

r RK EEVLDVDNH I YEVLADWL3 VG IDPTKS 1 1 YLQSAIPEIYELHLLFSMLI SINRVMGI 
PSLKDMAR^IASIEECSI^GLIGYPILQSADILLAKAQFVPVGKDNEAHVELTRDIARNF 
NR L YGQVF P E P EVLQGELTS LVG I DGQGKMS KSANNA I YLSDSDAT ITEKVRKMYTD PNR 
IRATTPGRVEGNPLFIYHDIFTJPHKDEVEEFKARYRQGCIKDIEVKARXAEELIHFLKPI 
KERRSEFLSKPLALOf^EDGTHKMREVAKVTMEEVHDKFGFSHKWRSUK 

CPn_0803 910306 909752 

CT584 hypothetical protein 

FmAKTKTLELEDN^LLLEG^KRIFATPIGYTTFREFQrAA^CANGWEIANFFFEM 

LINGKLTQELAPQQKQAAHSLIAEFMMPIRVAKDIHERGEFINFITSDMLTQQERCIFLN 

RLJ^VDGOETLIJTrDVQOTCHLIRHLLARIX£AOKNPW 

TKALQ 

CPn_0804 911074 910310 

gp6D-CHLTR Plasmid Paralog 

E I FSSMGNLKTLLESR FKKOT PTKMEALARKRMEGDPSPI^VRLSNPTLS S K EKEQLRHL 
LQHYNFREQIEEPDLTQtXTTI^AEVKQIHHQSVlJ>tGERITKVRDL[JCSYREGAFSSWLL 
LT YGNRQTP YNF LVYY ELFTLL PEPLK I EMEKMPRQ AVYTLAS RQG PQEKKEE 1 1 RNYRG 
ERKSELLDRIRKEFPLVETDCRKTSPVKQALAMLTKGSQILTKCTSLSSDEQIILEKLIK 
KLEKVKSNLFPDTKV 

CPn_0805 911846 911067 

minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D

GYASRMKTI AVNSFKGGTAKTS TT LH LGAALAQ YHQ ARVLL I DFDAQANLT SGLGLDPDC 

YTJSLAVVI^GEKEIQEVIRPIQDTQLDLIPADTWLERIEVSGNLAADR 

VQDKYDYVI IDTPPSLCWLTESALIAADYALICATPEFYSVKGLERLAGFIQGISARHPL 

T I LGVALS FWNC RGKNNS AFAEL I HKTFPGKLLNTK I RRDITVSEAAIHGKPVFATSPS A 

RASEDYFNLTKELLILLRDI 

CPn_0806 913816 911867 

thrS-Threonyl tRNA Synthetase
NSSIlfteSPPNMEAWNKMIQVTCDQ 

TH@EGDTLVFLTSEDPEGREI FLHTSAHLLAQAVLRLWPDAI PTIGPVIDHGFYYDFAN 
LS^ESDFPLIEDTVKQIVDEKLAISRFTYGDKQQALAQFPQNPFKTELIRELPENEEIS 
AY§^EFFDLCRGPHLPSTAHVKAFK^TSAAYWRGDPSRESLVRIYGTSFPTSKELRA 
HLEQI EEAKKRDHRVLGAKLDLFSQQESS PGMPFFH PRGMI VWDAL I RYWKQLHT AAGYK 
E IbKF PQ LMNRQ LWEVSGHWDNYKANMYTLQ I DDEDY A I KPMNC PGCML YYKT RLHSYKEF 
PtRVAEVGHVHROEASGALSGLMRVRAFHQDDAHVFLTPEQVEEET LN I LQLVSTLYGTF 
GEEY^LEI^STRPEKDTIGDDSLWEIATDALNRALVOSGTPFIVRPGEGAFYGPKIDIH " 
DAilQRTWQCGT IQI^FLPERFELEYTTAOGTKSVPVMLHRALFGS IERFLGILIENF 
RFPLWLS P EQVR I ITVADRH I PRAKELEEAWKRLGLWTLDDS SESVSKKIRNAQNMQ\j 
Y^I^DHEINENVLAVRTRDNRVINDVSVERFLOTILEEKNSI^LTAIi 

CPn_0807 913950 914879

CT580 hypothetical protein

TLQTGLHMSLFLVFLTAF I WS S S FAL SKLVMNAS AP I FATGARMVI AGAI LALAAWFRG 
FVGISKKIFLYIVLLALTGFYLTNIFEFIGLQSLSSSKTCFIYGLSPLMSALFSYIQLKE 1 
KWEKKVLGLSIXLVSYICYLTFGG<X3DDSQPVm^IGLPELLILGAASL^ 
I EiCQSTLSVT A I NA Y AML I AGMLS IMHSAWEPWRPLPVQDISQFLYATLALWIsifclC 
YNL^AKLLRKYS STFLSFCNLVMPLYSGFYGWI LLGEKGVSLGLVLAVAFMVAGCSt IYH 
EEFRQGYIVS 

CPn_0808 916398 914956

CT579 hypothetical protein 

LK^SWALKSLKRMPQSAEPSIJWIKPIIFKGACIAOTSGVSGSSSODPTyAAOLAQSS 
Q^^AQSGHDTKNVTKGXjAQAEVAAGGFEDLIQDASAQSTGKKEIATSSTT^SKGEKSE 
KSG|SKSSTSVASASETATAQAVQGPKGLRQNNYDSPSLPTPEAQT ING IJFLKKGMGTLA 
LLGIiVMTLMANAAGESWKAS FQSQNQA IRSQVESAPA IG EA I KRQANHQASATEAQAKQS 
LI SGI VNIVGFTVSVGAG I FSAAKGATSALKS ASFAKETGAS AAGGAASTCALTSASSSVQ 
QTMASTAKAATTAASSAGSAATKAAANLTDDMAAAASKh^ 

WS EKVS RGMNWKTQG ARVAS FAGNALS SSMQMSQLMHGLTAAVEGL^AGQTG I EVAHHQ 
RLAGQ AEAQ AEV LKQM S SVYGQQ AGO AGO LQ EQ AMQS FNT ALQT LQ1QI ADS GTGTT S A I F 
N 

CPn_0809 917794 916307 

CT578 hypothetical protein 

DTNMSISSSSGPDN0KNIMS0VLTSTPCX3VPO^DKLSGNETK(|flQ0TR0GKNTEMESDAT 
IAGASGKDKTSSTTKTETAPC^VAAGKESSESOKAGADTGy^AAATTASNTATKIAMQ 
TSIEEASKSMESTLESL0SLSAAQMKEVEAVWAAL3GKSSGSAKLETPELPKPGVTPRS 
EV I E IGLALAKAIOTUjEATKSALSWASTQAQAD^TNKIjGLEKQAIKIDKEREEYQEMK 
AAEQKSKDLECTMDTVNTVM I AVSVA ITV I S I VAA I FTCEAG LAGLAAG AAVGAAAAGGA 

agaaaattvato itvoawqavkqavitavrqa ITAA Ib&WKSG I KAF iktlvka iaka 

ISKGISKVFAKGTCMIAKNFPKLSKVISSLTSKWVTVGVGVWAAPAIjGKGIMQMOLSEM 
QQNV AO FQ K EVG KLQ AAADM ISM FTQ FWQQAS K I AS KQTG ESN EMTQ KATKLG AO I LKA Y 
AAISGAIAGAHKTNNF 1 

CPn_0810 918193 917325

CT577 hypothetical protein

C^EIWJKKI J KKTKKAVOSKAAPVKRVPEESOEAA/oQLELAVSDLYKELrLAOTFASLTDK 
NOmSI lAALOTLEriLHf^ELTOGLFr^AOE/.ANFAKEL^SWir.LKNLTTVVNKOI^VK 

CPnJJHl 1 'US 1 .'),) UHiGOS 

U:tH-L'.w 1 1 1 U.v.pon;:^ Prnr.,.u n / 
' LgNNFT [; ; 1 l<: :M::Kl , Sf7<NANOP'jKP:^\/RIKKTR:;RLAi;LAAUKKAKADrjLK<JVJIPVPT 
KEE r KKAU IN t mil ,:;Nf ILDLOO L LGLj^YLLEEI YTVAYTFY! :<> IK Y N EA'/ r l LFO L LAA 
Aurw/KYMU;[.:::\:YrlO[.MLYNEAA^FFUVFDAOPDNP[PPYY[AP^^XKL0OPEE^N 

nfi .nvi'Mi j n v inn i -i:fk r lkercq imRo:i r ekoma^etkkaitkkpai :k : :ktttnkk:;gk 



mutL-DNA Mismatch
0 1 L ICWLCNLTKAPM3TRR ^^^)PLT }Uq I AAGEV : EN:;v:JWK EL [ EN3LDAGADEI 

ei etlgccogai : irdnccgfraeoi f 5ft lor hats k i ref::d r fslnsfgfrcealps i 

A3I3KMEIQS3IECD ECVRTV t HGGO Lv SC E PC A RO LGTTV I VNS L FYNV PVRRGFQKSM 
Q3 DR LG I RK L I ENR I LSTAN I GWSW IE ECHH E 10 t A KQO( j FQ ER VA YVMG DH FMQDALT I 
DKEANGVR IVCVLC3PSFHRFTRQG0K I F INDRP I ESLF I 3KKVGDAYALLLPLHRYPVF 
VLKLYL PSSWCDFNVH PQK I EAR I LfcEELVCDC I KEA I VETLACPPC I LCRTHQE I EES D 
SVPLPMFRMLETSDVQEEESVEFDONLFAYSSEDV'SLEKOEYT^RGPKSOMDWIYSSDVR 

". r .T:"*:i'"/:.Ai-:i/!."';v:!: r r~r.^,x f- " i-"r * : ::--vv.. ;vt?"\ 
;■■».-: :i iif' if?* .i Tf /:•■;:.- w, ' /vf. • : -r, tv : ti i-:w: .; .: .; .aai-- : n:;ea- : 

ALMKETLTQATFSKHQH VFDVSWLKLLW JVfJK F EK<JK LOAK I RRL I LDSDFMEG 

CPn_0813 9^6843 921934 

pepP-Aminopeptidase P

TL I LWKDNHMSHDR I LRAJRALS EHNLDA ILVEKSEDLAYFLHDEAIAGILL IGQQ EVMF 
FVY1WKDLYSHIQRVPDTFLTQDWAOLSLYVQKQRYQKIGFOSASTVYHKFAQR0VLP 
CLWE PL ECFTEK IRS 1Kb EEE I RRMQ EAAALGSAGYDYVLTLLR EG ITEKEWRQLRAFW 
AEAGAEGPSFPP 1 1 AFCEHSAFPHS I PTDRPLKKGD I VL I D IGVLLNGYCSDMTRMTALG 
TPHPKLLE^YPVVVEAQKRAMALCKEGVL^DIDAf^VHVLREHHLDTYFIHGIGHGVGR 
H I HEY PCS PRGSQVtfLESGMT ITVEPGVY F PG I GG I R I EDTLC I DKNKNFSLTARPVI S E 
LVCL 

CPn_0814 921996 923357
CT814.1 hypothetical protein 

FFLFFKLSYNBaFNLPLTMYQLLS IGYSFVSFIALLWMLCYSPNYVTDLYRISLSAEESL 
CCIRAFP0A£6LLGGACAIJ^FPDLEERLPDLRKELLFIXSNDRPDACGGKFSLOLASSKE 
CTIAALKFJy!mJJVTNSSRGFWSFSPKGVPTELWIE 

ISKPRIXEYLFIiNPPANKLDCWEIAGFRVDASFPVKQKIRKIGVDKFIXMHGGAEYADKA 

TKEJ^VDFysSDEFJ^SRYLAVT3DVLLWDGNCW(7TCGEFC<3ASSR^ 

DLWNVGGTQRQT I SLVKGVPS P I E INEVI RE I EFTGMRSWSKP IVLVCGORL ILS PDDWV 

LRTAKGwEKLSRADQ I Q D YVTG KVTG P LL VF E KX EKDLRG FVLRG HMFN AQRT L VET ISL 

PLKQGF^E PAV ASQ EVS SNTRS AAAH PGATNRGGS 

CPn_0815 923361 925622

gspD/pilQ-Gen. Secretion Protein D

FF RN S LLH LVALSGMLC C S S GVALT I AEKKASLEH SG RGADDY EGMA S FNANMR EYS L 
O^SKXYEEARKLRASGTEDEALWKDL I RR IGEVRGYLRE I EELWAAE I R EKGGNLEDYAL 
HPETTI YNLVTDYGTEDS IYLI PQEIGAIKIATLSKFWPKESFEDCLTQ ILSRLG IG 
T?QVNSWI KELYMMRKEGC SVAGVF S S RKDLEALPET AY IGFVLNSNVDAHTNQHVLKKF 
aNPETTHVDVI AGRVWI FGSAGEVGELLK I YNFVQSES I RQEYRVI PLTKI DPGEM I S I L 
NAAFREDLTKDVSEESLGLRWPLQY0^3RSLFLSGTAALVQQALTLIRELEEGIENPTDK 
TVTWNVKHSDPQELAAIXSQVHDWSGENKASVGAATC I Q I DTTVSSSAKD 

GSVKYG^IADSKTXH'LIMVVEKEVLPRIQMLLKKLDVPKKMVK 
GLMXilLGEEVCKKGCSPSVSWAGGTGILEFLFKGSTGSSIVPGYDLAYOFLMAQ 
NASP SWTMNQT PAR I AWDEMS I AVSSDKDKAQYNRAOYG I M I KMLPVINVGEEDGKSY 
ITLETDITFDTTGKNHDDRPDVTRRNITNKVRIArc 

DIPGIGKLFGMSSTSDSLTEMFWITPKILENPVEOTERKEEAIXSSRPGEREEYYOALA 
AS EAAARAAHKKLEMFPASGVSLSQVERQEYDGC 

CPn_0816 925600 927102 

gspE-Gen. Secretion Protein E 

RGKNTMAASILSQEIXDILPYTFUCKHCLLPIEESSEAITIAHATATSVIAQDEVKU.IK 

KPVRFVLKEES E I LQRLQQ LYSNREGNVS DMLLTMKEEDGTT I SEEEDLLETTDT I PWR 

LiL^IIJCEAIEERASDIHFEPCEDSMRIRYRIDGVLHDRHSPPSHLRSALTTRIJ^ 

DIAEHRLPQDGRIKIHIGGQEVDMRVSWPVIYGERVVLRILDKJ^WILDIAGLiW 

EI LFKDT IT APEG I LLVTG PTCSGKTTTLYSVLQELKGPLTNIMTI EDP PEYKLPGIAQI 

AVXPKIGLTFARGLRHLLRQDPDIU1VGEIPJDQETAEIAIQAALTGHLWSTLHTNDAIS 

AI PRLLDMG I ESYLLS ATLVGWAQRLVRT IC PYCKVAYT PENQEKS FLASLGKDT EMPL 

YRGQGCVHC FRSGYKGRQG I YEFLRPNTLFRS EVASNRPYH I LRETAEQNGFLP I LEHG I 

ALAVSGETTLAEVLRVTKRCD 

CPn_0817 927106 928287 

gspF-Gen. Secretion Protein F 

GGRMPRYRYTYLDPKERRKRGYLEALHIQEAREKIAQENIOVLDIREVALRRMSIKSTEL 
IVFTKQLLLLLRSGLPLYESLVSLRDQYHEQKMGLLLTSFMETLRSGGSLSQAMAAHPNI 
FDHFYCSGVAAGESVGNLEGCLQNIIVVLEERAQITKKMVGALSYPCVLLVFSFAVMLFF 
LLGVIPSLKETFENMEVKGLTKIVFGVSDCLSAYRYLFI^FASALITVGILMRHRIPWKK 
. I LEKLLFAL PGT KKFWKVAVNRFCSVAS AI LKGGGTL I EG L DLGC DAI PYDRLKT DMRO 
IVQAVIGGGSLSQELAQRSWPKI^IGMIALGEESGDLADV^GYVAHIYNEDTQKTLASI 
TSWCQPVI L I FLGGL IGVI MLA I LI PLTSN IQTL 

CPn_0818 928158 928682 

predicted OMP [leader (16) peptide] 

GYTKNVGFDNWVSTRDSDFSWWPDRCDH VGN I DPTHKQY PN 1 1 KCVLRGVGMKRQK.RKQ 
S ITLI EMMW ITL IG I IGGALAFNMRGS IHKCKVFOSEQNCAKVYDILMMEYATCGSSLK 
EI IAHKETWEEASWCKEGRKLLKDAWGEDLIVOLNDKGDDLVIFSKRVOSSNKK 

CPn_0819 929117 928956

CT568 hypothetical protein 

ASLYGYCLFLIWEKFHNNtGKANFHLKI ITTDFLTDIYIVTIRDPIAYPLTGIC 

CPn_0820 929042 929659 

CT567 hypothetical protein 

DESLPCRCCCCTFPRSETSSIRTEWPMCNSIAMKKOKRGFVLMELLMSFTLIALLLCTLG 
FVA'RKIYTVOKOKERIYNFYIEESRAYKOLRTLF^MSLSSSYEEPGSLFSLIFDRGVYRD 
PKLJ^AVRASLMHCTKDORLELRICNIKDOSYFETORLLSHVTHVVLSFQRNPDPEKLPE 
TIALTITREPKAYPPRTLTYQFAVGK 

CFn_082l •);>-.Kit7 'ilOfSCH 

i:TSr,*i hypor.h.M' u:,j t (J ror.(;in 
■ HTNI.RU^NKPMOPK I FTf .LOLT::t .VITI/ZAFDAAf IARKRCACAQT I ERGF-NFFf; [KRSACA 

ei eyoeksrha:;a i er r tjk dkc ikvtpko i akvatkkkor yrllvv pf.'jr n 'Nn.^r ynlya 

L,L:' E P f J EC* Y : ' DT A: I K [ RLLRRAr/DTntP/F'PnfiEYA I ANAL ir.NKQE ILERGAQLG 
l1>VL^rL-arFP^AE[FYKML.Kf;:-;N:;f;:;LLNKLI^EEK:;LTJ[c:KLN[.rF-'MDr[XLEAVL 
IMll1»AYRPr::[.LRr\;rWFAVKr'.OKMAIOKnrw^LEL.FKTRTDFRIXLKnKMOLLLSRY 

iM.Li'LLNKKMrwri*;::Af;itYr.i-"i.vi>r-i/rKA[:;K:r«:p;;K::[KL 
i'I'h.oh:!.: txyCfafy-mu/r, -ml.::^ 

i"t"it.', tiy|H>l hoi i.M | [n^t ... i ri 

F FL [ I VL l:IT I KN I t ( WMAUFTI 'K Kt I: V. IKK.'.'.'.'.'IOFD.' Il.KRKVKDLH.'INPKV'IKWKKFL 
:;HHA':KAI^ICI.VF.VM[ fAl>Fl:.-WA*V;i.l- , [A(-';7 1 /l;:KIIVErRKMI.:;NI^::Y::[AIWPlK 
NA r LO ;L 11. KKVLN t PSFAVSF I VT/TV r LSF TTTA PS<



IHCDKHQDTSNKPS 



CPn_0823 932424 931501

yscT/spriP-YopT Translocation T

FYAUJVRFSKTSrNCNKELMCrCLPELFSNLGSAYLDYIFQHPPAYVWSVFLLLLARLLP 
CFAVAPFLCAKLFPSPIKIGISLSWLAr IFPKV1ADTQITNYMDNNLFYVLLVKEMI IGI 
V I G FVLA F P FY AAQS AGS FITNQQG r OC L EGAT3L I S I EQTS PHG I LYH YFVT I 1 FWLVG 
'WRIVI.SLLLOTr^TPIHSFFPAEMMSLSAPrWTTMIKHCOLCLVWrrOLSAPAALAML 

m: :i : i ; v.i r-i\r vv*// 1 v: .; .. :.m .kakh' ;[.:,;•*:.: :..y.\wf r ; Kg 1 0Y.-Tu\WFKi -r/p emi. 
;.■ :.:rnv.':. 

CPn_0824 932677 932379 

yscS/f liQ-YopS/f liQ Translocation Protein 

IRTRAVLAFFATSFKSVLFEYSYQSLLLILIVSAPPI ILAS IVGIMVAI FQAATQIQEQT 
FAF AVKLW I FCTLMI SGGWLSNM I LRFAGCI FQNFYKWK 

CPn_0825 933618 932677 

yscR-Yop Translocation R 

ER IKVFTIMRS I FRFSLXrFFTLSVSCCFADASLYENSCPSRCQPTPPPSNSNPLNWQQP 
VAASSVPSYNPPL^ADDVLPRI)HLSDGSFSI7rYPDITTQAriLIFXJU>SPFLVKLLTSYL 
KI I ITLVLLRNALGVQQTPPSQVLNGIALILS IYVMFPTGVAMYKDARKEIEANTIPQSL 
FTAEGAETVFVAI^IKSKEPLJlSFLIR^PKAQIQSFYKrSQKTFPSEIRAHLTASDFVII 
I PAF IMGQ IKNAFEIGVLIYLPFFVI DLVTANVLVAMQMMMLS PLSISL PLKLLL I VMVD 
GWTLLLQGLMISFK 

CPn_0826 934382 933612

yscL-Yop Translocation L 

HD^SGVFSSEWQPQRYYAIVKMKFFSLIFKDDDVSPNKKVLSPEAFSAFLDAKELLE 
KTKADSE^WAETEQKCAQIRQE^DQGFKEGSESWSKQIATLEEETKNLRIRVREALVP 
LAI ASVRKI IGKELELH PET I VS I ISQALKELTQNKHI IISVNPKDLPLVEKSRPELKNI 
VEYADS LI LTAKPDVTPGGC 1 1 ET EAG 1 1 NAQLDVQLDALEKAFST ILKAKNPVDEPS ET 
S S STDSS SLSNDQDKKE 

CPn_0827 935273 934434 

CT560 hypothetical protein 

GCLVTAOTFGTLDILMKHSKEDDLSRFXPKNLLVESPHPEEIPIiCSL5FTMSWLPTIHPS 
' WITIAMKEFPPEIC<MLLAWLPEPLVQEILPLLPGISIAPHRCAPFGAFYLLDMLSKKIR 
PCGITEEIFLPASSANAILYYTGPWIALINCI^LYSIAKELXHILDKVVIERVKNALSP 
TEKLFLTYCQSHPMKHLETTNFLSSWTTDAELRQFVHKG^LEFLGKALTKENASFLWYFL 
RRI^VGRAYIVEOTU<TWYI)HPYVDYFKSRLEX3CMKVLVK 

CPn_0828 936292 935267

yscJ-Yop Translocation J

IKR^AWIMVRRSISFCLFFLMTLL^CTSCNSRSLIVHGLPGREANEIVVLLVSKGVAAQK 
L PQAAAATAGAATEQMWDI AVP S AQ ITEALA I LNQAGL PRMKGTSLLDLF AKQGLVPS ~ ~ 
QEH^YQEGLSEQMASTIRKMDGVVDASVQISFTTENEDNLPLTASWIKHRGVUJNR^ 
I^SKIKRLIASAVPGLVPENVSWSDRAAYSDITI^PWLTEEIDWSVVTCIILA^ 
LT&^IFYVLILILFVISCGLLWVIWKTHTLIMTMGGTKGFF^ G 
AAADKEKKEDADSQGESKNAETSDKDSSDKDAPEGSNEIEGA 



CPn_0829 936729 937298

No robust homolog present in Genebank/EMBL as of 11/7/98
KYJCj^PTLAKSFYINIPXlSRFYSWLCFIMKE 
FF^AKWPLVPAGYRRVRGKDFVLSPLVDLVIU^PWVTKDSRYSPCSMTFTCICRS/^ 
C I $$A7 STLFG IGRFCAVWCVEGFSGSTFDK I YHT IVAVLG I LGLG ILTFILRIIFS 
PVWFLFKCYS 

CPn_0830 937339 937959

No robust homolog present in Genebank/EMBL as of 11/7/98
DSfS^LPCFEVEAtfTFPQVFSKVWYKYKSSRILLIALLYNITLVL^^ 
GR Vl LK I YQNEEEFFRATERF P S IGAGYLRVRNXNSVLF P FEDLMLVCPSyPKDF PLSAF 
KVfTKLI YWSVLES I PWGAFF FS IGRLFAMWCI EDFPGS I FSRI YHTiyGVLG I LGLG I 
IMFILRI I FTLLTLPFWLISCLKSSAA 

CPn_0831 938249 938434

No robust homolog present in Genebank/EMBL as of 11/7/98
WRgNNVLIPJ(SESEGAFFEATQNYPTIQC£YQLVRIREH^ 

AA*y 

CPn_0832 939750 938827

lipA-Lipoate Synthetase
VMKC RPTLNTDQPRVRKKL P ER F PKWLQR PLPQGSAF H ATOAT I KRSGM PTVC EEALC PN 
RAECWSRKTATYIALGDVCTRSCGFCNIGHSKTPPALDP^EPERIALSAKELGLKHWIT 
MVARDDLEDGGAQGLVDI IQKLREELPQATTEVLASDFQGhA/SALHTLLDSGITIYNHNV 
ETVARLSPLWHKATYA^SMFMLEQAANYLPDLKIK3G4mVGLGEMEGEVKQTLQDLASI 
GVR I VT IGOY LR PSRKHLQVKS YVT P ETFDYYRRVG EAMCLFVYAG PFVRSS FNADH I LA 
SVQDKASA 

CPn_0833 941171 939747

lpdA-Lipoamide Dehydrogenase 
RG VLFE I L I TVS EMMTQEFDCW IGAG PSG Yv/a I TAAQSKLRTAL I EEDQAGGTC LNRG 
C I PSKAL I AGANW SHIKHAEQFGI HVDG YT ITJY P AMAK RKNTWQG I RQG L EGLI RSNK 
[T VLKGTGS LVS ST EVKV IGQDTT 1 1 KANH^i L.3.TG3 E PR PF PGVP FS S R I LSSTG I LEL 
EVLPKKLAI IGCGVICCEFASLFHTLGVE DTVI EALDH I LAVNNKEVSQTVTNKFTKQC I 
R ILTKAS I SAI EESONQVR ITVNDQVEEEOYVLVAIGRQFNTAS IGLDNAGVIRDDRGVI 
PVDETMRTNVPN r YA IGD ITGKWLLAHVASHQGV I AAKN ISGHHEVMDYSA I PSVI FTHP 
EI AMVCLSLQEAEQQNLPAKLTKFPFKA ICKAVALGA3DCFAA £ VSHE I TOO I LGAYV IC 
PHASSLICEMTl^IRNELTLrCIYET/mAHPTLsEVWAEGALLATNHPLHFPPKS 

CPnJIH M -1/11544/ 'MJOU 

<rr*j r ,*'. hypnt.h,_-r. Um L pmt/in 

K I f 'MPFAK RTEMQRTCWKt .l\G;jV^ jMI I VPO< : PYC:'AFL0DPPVA: "A 'A 1 FSijCH I C FPEGASK 

ko:ardlfav:;:;edwf-:avu ;[>Wptqr:tnkov r pewtwloijwplaalflg lgllafaflil 

I.KlITUJCLVL/I'WI'KNRAYFY/t ICAAVAYRCYRKLPL 

moil :;W[/::Nr t.iniiL^ heliu.ise 
UN l MVf .KAI.A I FRyOAMOH \ ,L,K HR K E T WDP.'EP: "i'T I HI i 'OICKAPEG YWLCTLKLQDrD 
HI/IM^V:'V:i;HX;r-y-i:L/tJn^YFAVYDAI^UirLHLrFHM::iWAVF:;HFFLD^rPLQAO 
< IKMVYTI.ElIPM ITLT/k< 'LSIlEVFQDWLRT NIASEEPTV[TNKTFt,K:]ALYRTAKKFFFL 
MM lAKt.T t' :KN^I--r-::ilF:lLOWr/;LVFKAF [LSFPTLEDI FI'Kl.ELAHTCLENVJHDI 



tTNVTve aeeakvnftls^^Mdr enh pkt/ i c: ; ve yv aktu em :tc pka t al p i y a 

I PLLADKFKDQLLSLLCYD3i^f.R YD I RLLitDAS FSFSAY LVTPGDLDNGSL I YPNYC 
YSPTKGLMQWGMLoPKQAF [VKSEOVEDFLJfERGHLIQEPGFOTFINERPEGHLTYNVT 
EQGVLLFHYDVGDPSGTE I RFGTWTYYTNQGTFLEK KNDLP IQDGLIVEPQD I PAFIVKN 
DAAI^RLPNFFSSPPNLKDLLIEVHRQSRG^CLDLKPILVGLGESRCWLFGVFLYREDIG 
FSLIPTPUX^LCFLPRVIPPENVPQFLTOYAOHERrLFPNPOTRPPESYELVIOSIHRPH 
PAS PLHLQLELKTNLGSVP IG I ALQCLKfi KHTFLFTQAG FLDLKONLFQFLKQFLSTQKC 
VI AENTVIANITDVFKLDALAPU'OTnCT r ANPEDLOFFSOLKAACLPP I PQNLFSSDHO 

i,iw-v;N.'- w.i.'/.MWFi.Y.MMi't.. iktmoatall: 1 :vf;:;. vp: iaftkfl :*.■": E- 

•r.'\" i'::'.-;!-:n::..:Nin.; : - ••.•kj.i ta!" n.:;":- v "rr.P'. , NYr'KFYK:AFT:v 

VFDEIHMAKNKSSQ I HKI LCR I Da/mKLJLTGTPI ENNLLEFKCLLDI I LPNYLPSDALF 
KKLFTKRCSSEELEEI IPSQDLLmCLTRPFILRRTKKLVLPELPDKVESI IACSLSPDQE 
KLYMATLQREKSHIOKLETPEE/aTNFLH I FALLNHLKQ ICDHPAVFFKDPDQYKNYESG 
KWNAFVKLLKESLNAGYKVWB^QY I HM I RI ITLYLEEIGIKYASIQGKSLNRKEEI ETF 
TTDPNCQVFVGS LLAAGTG I nLtaGNW I MYDRWWN PAKENQ ALDRVHR IGQKNTVF I YK 
LI TEDTLEER I HYL I EKKI ftLLDKV I ASQ DSN ILHMLNR EDLLT I LSYKDEHGTSDS EES 
PVDAPVEDDTGVLPPEDS > 

CPn_0836 946960 945722

brnQ-Amino Acid (Branched) Transport

KMKKNASHICTNDKKS/S IWS IGGS I FAMFFGAGNIVFPLALG YHYNAH PWSAYFGMMLTA 
VCVPLLGLVSMLFYSGDYQKFFFSIGRI PGM IFITAIILL IG PFGG I PRAI AVSHATLI S 
LSEHKSAFIPSLP/FSAICCVLIYIFSCKLSRLIQWLGSVFFPIMLVTLLWVI IRSFMI P 
THPMVQEFIP^^ArfQAWLAGFIEGF^^'MDLL^FFFCSIVLISLilOLVA£EKHPTEEEIPL 
S FQG I SKKNKRSTLALGFFLAAI LLCMTYLGFVLSAAJIHAGLLVNVSKGHILGRISAIALG 
PNS I ACLTTEI ALVG I VADFLARWS FKKLNYASAVICTL I PTYLI S I LNFE 
TISHIiLPLWLSYPALIVLACGNIAYKLWNFRYSPVLFYLTLSLTIVLKLVN 

CPn_0837 947777 947145

nth-Endonuclease III

ltmkqmlrti^alfpnpkpslegwsspfqlliaillsgnstdkavnsvtpqlfakapda 
qsiu:/ppgklyqliapcgu;erksayiyqlsoilvrdfhgeppni»ialltolpgvgrkt 

3IAYGKPTFPVimiILRIAQRWKISEKKSPSAA£KDIJUlFFGHENTPKIi4I^ 

yar^ycpalhhkidncpicsylakeanstrt 

CPn_0838 949196 947781

thdF-Thiophene/Furan Oxidation Protein
^St^IYPNSFHLFNIJClJGIL^ESSFlIFSIFMLKHDTIAAIATPPGEGSIAVVRLSGPQAI 
f VIACRIFSGSVASFASHTIHLGQVIFEETLIDQALLLLMRS PRSFTGEDWEFOCHGGFF 
AC SQ ILDAL I ALGARPALPGEFSQRAFLNGK I DLVQAEA IQNL I VAEN I DAF R I AQTH FQ 
GNFSKKIQEIHTLIIEALAFLEVLADFPEEEQPDLLVPQEKIQh^HIVEDFISSFDEGQ 
RLACGTSLILAGKPNVGKSSLIJiAIXQKNRAIVTHIPGTTRDILEEC>^ 
AGQRTTDND I EKEG I ERALSAMEEADG I LWVI DATQ PLEDLPK I LFTKPSFLLWNKADLT 
PPPFlirrSLPQFAISAKTGEGLTQVKQALIG^^KQEAGKTSKVFLVSSRHHMILQEVAR 
CLKEAQQNLYLQPPEI IALELREALHSIGMLSGKEVTES ILGEIFSKFCIGK 

CPn_0839 949230 950159 

psdD-Phosphatidylserine Decarboxylase 

FLF I VSRGLVQKPQYI DRITKKKVI EPI FYEKTMLFLYNSKLGKKLSVFLSTH P I FSRI Y 
GWI^RCSWTRRQIRPFMNRYXISEKELTKPA/A^FTSFTIDFFTRKIJ<PEARPIVGGKEVFI 
TPVDGRYLVYPNVSEFDKFrVKSKAFSLPKLUJDHELTKLYAWGSIVFARLAPFDYHRFH 
FPCDCLPQKTRCVM3AIJSVHPI^VKDNFILFCENKRTVTVLCT 

GSIVQTFSPNQTYAKGDEKGFFAFX^SWILLFLPNAIRFDNDLIJCNSRMGFETRCLMGQ 
SLGRSQREEI 

CPn_0840 950141 951544

CT700 hypothetical protein 

ISERi^LXTLKTFFGIAKRDKSQKV^IMWLVILWAI^AASLAIALVAKGYYRFVYFRRYAV 
QVI REVRLSMELKEWALAEQQLLP I LKKRSYRRQCLFEYMR I LRKMQRFEES EKLLAEAK 
KLGLRGPYFFLEIAYKAYRFGAFKECAQAFASVPQDLFEEEDAAKYASALVRLGDLDAAC 
SLIEPWISPLSHQETFVTMGHIYFTSKRYKDAIDFYNRANAL<^CPVEVTYNLAOAYRIT 
SSYAKAGKLFRKLLSNPVYKEEALFNIGLCEOKLGRPGKALLIYQSSDLWSRGDALLMKY 
AAMAAMDQRDYVLAEPCWELALRCST FAKDYKCGLGYGF S LC RLRKYGDAERVYCNL IQN 
FPECLTACKAI^V^GVGYATLLGSEEGI^AXKAVEXDHSCETLELLSACEAKCGNFDA 
AYEIQSFLSSRDTSLOEKQRRSQILRILRXKLPLNDHHIVEVDALLAA 

CPn_0841 951719 954640 

secA-Translocase SecA

IKRHMLGFLKRFFGSSQERILKKFQKLVDKVNIYDEMLTPLSDDELRNKTAELKORYQNG 
ES LDSMLPEAYG WKNVCRRLAGT PVEVSGYHQRWDMVPYDVQ I LGAIAMHKG FIT EMQT 
GEGKTLTAVMPLYLNALTGKPVHLVTVNDYLAQRDCEWGSVLRWLGLTTGVLV 
KRKK I YQCDWYGTASEFGFDYLRDNS I ATRLEEQVGRGYYFAI IDEVDS IL I DEARTPL 
IISGPGEKHNPVYFELKEKVASLVYLQKELCSRIALEARRGLDSFLDVDILPKDKKVLEG 
I S E FC R S LWLVS KGMPLNRVLRRVREH PDLRAM I DKWDVYYH AEQNKEESLERLSE LY 1 1 
VDEHNMDFELTDKGMOQWEYAGGSTEEFVMMDMGHEYALIENDETLSPADK INKK I AIS 
EEDTLP.KARAHGLROLLRAOLLMERDVDY IVRDDQIVI IDEHTGRPQPGRRFSEGLHOAI 
EAKEHVT IRKESQTLATVTLQNFFRLYEK LAGMTGTA IT ESREFKE I YNLYVLQVPTFK P 
CLR I DHMDEFYMTEREKYHAI VNEI AT I HCKGNP I LVGT ESVEVS EK LS R I LRQNR I EHT 
VLNAKTIHAQEAE I IAGACKLGAVTVATNMAGRGTDIKLDNEAVIVGGLHVIGTTRHOSRR 
IDROLRGRCARLGDPGAAKFFLSFEDRLMRLFASPKLNTLIRHFRPPEGEAMSDPMFNRL 
IETAQKRVECRNYTIRKHTLEYDDVMNKORQAIWVFRHDVLHAESVFDLAKEILCHVSLM 
VASL'/MSDROFKGWTLPNLEEWITSSFPIALNIEELRQLKDTDSIAEKIAAELIQEFOVR 
F DHMVEG LS KAGG EEL DAS A IC RDWRSVMVMH I DEQWR I HLVDMDLLRSEVGLRTVGOK 
DPLLEFKHESFLLFESLIRDIRITIARHLFRLELTVEPNPRVNNVIPTVATSFHNNVNYG 
PLELT/'/TDSEDOD 

CPn_0842 955015 954710

CT702 hypothetical protein (frame-shift with 0843)

KYYTPPT ICRS PWSN T ALKT I ^EPEYDCNOLLKTQSLLTTtA/DTLLNAPKDFPNSKNQKH 

I LFC I/iffrJTLJUY AOFLIACINRRKFW I RYYNDOVWHEWTPF I 



CT7fj;; hyporhcf: Li: 



hypnrhcf. u.mL protein ( f r.inw: r.hi tt with OH43) 
NKMKLi r-WRFKVMNYrVYm'PrrrDIILOn-TULt.DNSEI^^LDKYOETt^ 
DLLtVl/;F.::VKKtjTIHOI'X.» 

':p»_0fM4 ■i',- ) ::7o 

yplKJ •''JTP.i::e/';TP-LiLiuJiini pi or ..-in 

KNR f ET IMLK t A [ LGRI -NV( ;K:;f:[.KNM ."KH: :i ^ I Vtir.tjFrrYTM)H L.Y' lELIIAFf IVPAQV 
rr/rY/;7MIN:;EDYF0KH[YN0Af:tY;AKI^I)Vr.l.LV[DrW/;[TEEDAIUJ\KLLLPLKKPL 
[LVAf IKADfiROEEt^JIHETYKI aJ 1 77't':;*['AIIDKM [ 1/ri.lfjR [KLVANL.PEPRF.EEEE 
' ;leeu;vdf.i i eeseaalpsntf pdfjevfteg f:;peee|^^K pqqapktlk i aligrp

Wr.K'ltn INf ILLNEERCI ID^rxTrTRDTItDILY.-HKC^^BtDTACLRKMKSVKNSIE 

w r sgsrteka i srad i cllv i datqk ljg yekr i ls l ph i i li nkwdlleevrm 

EH YTKDLRATDPYLfiQAKMLC I3ATTKRNLKK I F3A I DELHHWSNKVPT PIVNKTLASA 
LH RNH PQV IOGR R LR E ™ I QKTTT PLQFLLF I NAKSLLTKHYEYY LKNTLKSSFNLYG I 
PFDLEFKEKPKRHN 



' ANf "A >F ! ■•( "','[•' f .ALK I: " " ■!• ; v r:"" 1 A ! KA:V r"/: : K I ,i*f i/V I YOA Y FT ? X'VRDMLMW 
RPLEDIDI ATNAS PT I VST IFPDVI J ICVAFG 1 1 T /KQDGRLFEVATFRSDGEYKDGRHP 
DRIIFSSMREDAIJIRDFTVNGMYYDPFEDKVFDFVEGTRDIEKKVIRAIGHPPXRFSEDK 
LRILRAIRFSSSLGFTLDPTTERAI IKEAPALVNSVSPERIWQELKKMLKRQPYGALSLL 
LKLKVL I FI FPELRDI PYSLLRTT I EFARKFNPTHFPEI LFLLPLFQGVSEEAATVAFGR 
LR I SNKELKL I ESWYEALPHFQNQSGNRVFWAHFLAS PTAPLFLELFSAJjQKDPSRQQHF 
I SRVQELESRLEOF I LR I KTSS PWS APDL I AKG I S PGRLLGDLLREAEILS I ENECLDK 
EKILLLLQEKGFWK 



CPn_0846 959383 958112



clpX-CLP Protease ATPase 
REHMNKKNLTICSFCGRSEKDVEKLIAGPSVYICDYCIKLCSGILDKKPSSTISSAPVSE 
TPSOPS DLRVLT PKEI KKH IDEYVIGQERAKKT IAVAVYNHYKRIRALLHNKQVSYGKSN 
VLLLG PTGSG KT L I AKT LAK I LDVPFT I ADATTLTEAGYVGEDVEN I VLRLLQAADYDVA 
RAERG 1 1 Y I DEI DK IG RTTANVS I TRDVSG EGVQQALLK I VEGTTANVPPKGGRKH PNQE 
Y I RVNT ENILF I VGGAFVNLDK 1 1 AKRLGKTT IGFS DDQADLSQKTRDHLLAKVETEDL I 
AFGMI PEFVGRFNC IVNCEELSLDELVAILTEPTNAIVKQYMELFAEENVKLVFKKEALY 
A I AKKAKQAKTGARALGMI LENLLRDLMFE I PSDPTVEAIHIQEDT IAENKAPII IRRTP 
EAIA 

CPn_0847 960019 959367 

clpP-CLP Protease Subunit 

KLFDEETOMTLVPW/EDTGRGERAMDIYSRi.LKDRrVMIGQEITEPIANTVIAQLLFLM 
SEDPKKDIQIFINSPGGYITAGLAIYDTIRFLGCDVhnTCIGQAASMGALLLSAGTKGKR 
HALPHSRMMIHQPSGGIIGTSADIQWAAEILTLjQCHIJ^ILSECTGOPVEXIIEDSERI) 
FFMGAEEAISYGLIDKVVTSAKETNKDTSST 

CPn_0848 961556 960177 

tig/murl-Trigger Factor-peptidyl-prolyl isomerase

VQASSPAFPFKSNKKGCLVPRSLSNEQFSVDLEESPGCIVSAI ^ VKVSPEVL^^aJWQALK 

KIJ5iQEITLPGFRKGKAPDDVIASRYPTNNn^LGELVTQDAYHALSTVGDRRPLSPKAVR 

SN^TQFDLQEGAK^FSYE^PAISDLPWEM^LPOEEAASEISDSDIEKGLTNIGMFF 

AT]j^WERPSQEGDFISISIi^SKSNDENASSAAIFENKYFTCI^EEE>frDAFKEKFLGIS 

TGTOWET ITSPEIQSFLRGDTLTFTV^VI £VS I PE IDDEK^ 

QLE^AKDKQLOKRFSEAEDAIJ^MLVDFELPTSI^EERISLITREKLLNARLIQYCSDEE 
LERKKS EL I KEAEEDATKALKLLFLTHKI FSDEKLT I SREELQYMMDVCSRERFGQQPPK 
DIS^U3ELVMSARDRLTYSKAIEHVtJlKAEIJ J ASTPSA 

CPn_0849 961752 965285

mo^r/snf-SWI/SNF family helicase

ad y-£:i h s ys rgemlnf rklrrdfs an i lqdgkklfeqgav1 daki lsmngetvc i s aqvr 
glydniyecei evdrs esdtvdsncdcsynydcqh ivallfyleqyfnemwayarsadl 
et&Aeineevkkeijcetfvaaatkeeerkd^ 

ekrjsaelavlfvsvneotfapanqpiefolvlrlpcrskpftisnirtflegvlyqepiv 

LNGRRFFFTMQSFNASDRjaiDLLIRYVRYPhlHTTEEKL^ 

qlacrgggslgekesfsglfcgnleeplcwsltpakmkfmi 

ddevqpeotmllesdapgiihhfvyhrfspqikrahlrsfsrlrdiaipealfgsfrena 

lp^eyaeiara^iinsfvtlpyvdevraicdmsyldgeleaklhflygslrvpaasla 

lq?qdvrafisdegilaiwlveerkmleewsgfiydekrx3afrvksekkivefotetip 

an^sitftjcpe^sgqfiydetifelsfregsdinyye^lkvhgi^gvpldi^wdci 

sakkrflelpkagc^skgtrrgkvnsgklpcilvldlekiapvvqifneigfkvlddlvq 

kce^sltgisldqfealpvnfsmserlieiqkqirgeiefdfqdvpqqiqatlrsygte 

gvh^erlrkmhu^ilj^dmgix;ktlqaiiavt^sklekgsgcslivcp^ 

fri#^pefrtlvidgvpsqrrkqltaiadri3vaitsynlu5kdvelyksfrfdyvvldea 

hh i knrttrnaksvkm iqsdhrli ltgtp i ensleelwslfdflmpglls sydrfvgky i j 

RTGtJiijfclGNKADNMV AL KKKVS P F I LRRMK EDVLKDL P PVS E I LYHCHLT ESQ KELYQSY A 
ASaMELSRLVKQEGFERI HIHVLATLTRLKQICCHPAI FAKDAPEPGDSAKYDMLMDL^ 
SSLWSGHKTWFSQYTKMLGIIKKDLESRGIPFWLDGSTKNRLDLVNQFNEDPSLL^ 
L I SLKAGGTGLNLVGADTV I HYDMWWNPAVENQATDRVH R IGQSRSVS S YKLVTLNT I£E 
KILTLQNRKKSLVKKVINSDDEWSKLTWEEVLELLQI 

CPn_0850 965254 966390 

mreB-Rod Shape Protein-Sugar Kinase 
EiSKKYWNCCRYDFMSPHRNLFKLKNFSNRLYNRAljGRFDKWNFFSGNVGIDUjDXhrrLV 
YVRGRG I VLS E P S W AVDAQTH AVLAVG H KAKAMLG KT P RK I MAVR PMKDGV I ADF E I AE 
CMLKAL I KRVTPSRSVFRPRI L I AVPSG ITGVEKRAVEOSALHAGAQEVI LI EEPMAAAI 
GVDLPVH EPAASMI ID ICGGTT E I A 1 1 SLGG I VESRS LR I AGDEFDEC 1 1 NY^RRTYNLM 
IG PRT A EE I K IT IGS A Y PLGDQ ELEM EVRGRDQVAG LP I TKR I NSVE I R EC LAEP I QQ 1 1 
EC VRLTLEKC P P ELS ADLVERCMVLAGGG AL I KGLDKALSKNTGLSVIT APJtPLLAVCLG 
TGKALEHLDQFKKRKGNLV 

CPn_0851 966378 968195 

pckA-Phosphoenolpyruvate Carboxykinase

refgi^/mwstnikheglkswidevaklttpkdirlcdgsdteydelg/rlmestgtmirl 
np efh pncf lvrssaddvarveqftf ictsteaeag ptnnwrdpoemrrelhqlfrgcmq 
grtlyivpfcmgpldspfsivgveltdspywcsmkimtrmgddvl/slgtsgkflkclh 
::vgk p ls pg eadvswpcn pksmr i vh fqddssvms fgsg yggnallgkkcvalrlas yma 
kuoowlaekml i ig itnpegkkkyfsasfpsacgkttflamlmpklpgwkiec igddi awi 
rpcirc^rlyawjpeycffgvapgtgertnpnalatcrsnsiftnvaltadgdvwweglte 
oppepltdwi^kpwkpr^spmhpnsrftaplrccpsldpewnarpocvpldaiifggrrs 
l-rr 1 plvyfalljwkhgvtigagmsstttaa rvgoljklrhdpfamlpfccynmayyfqkwl 

^I'AEriR^r.KLr'KlFGVNWFRKNNC^EFLWPrjFCENLPVLEWl/oRTDGLEDrAERTPrGY 
U'M [QKFMI^O.NLDL^VQELF^VDAECWIJ\EVEN IGEYLy IFG.SDCPOQITDELLRIK 

::i-:i.kek 

.■I'riJJH'jl! '*6$214 '1/0^13 

CT711 hypothetical protein
I Kl .ill rOYYYI .rNTVTLOPSY 1NFTPNVTTALSOGK llAl'.M ELUCSALFFQELQDKAQG 
LKIIAUXVyKLiIAKALRPAOVQT:; r^YLPTEESi'Rr'j/SAi ■ [ I DRTMPTFTDDEVKAI LQ 
NI'NFfrTT.'iK [ I VR( ; LUK VF K : ; Y LDS VT P T EG I D PS N P b/*J A L I LNY [TLLNNLKPKFAAGST 
1 Tl jADYNALYAU* ;HFVKF. t EALKAADAPPK:;KVHa/' , /^K [MT [ YNNMQVL3YPVTDYLN 
VU t At Jf .: :f .N rTAA(jF.V0OYl,KNFY. f J I LKt) I LNPiMrM'QATI I Y PADAEYN AR DAGV IQS L 



LNLSCNYROLTENMLPlTrD'^^I lAOIRSjttNtlVNCTI I ASNTLLPTTMRLCTLLGX' 
[YTYQCCAT I FGMSYGT3TP^^^^I DA INQKSYWOARANCFDVTriDOVFDfJFATN IQS 
GTSYRGIDLFKNhflO/NEINPtFL^QAASFLR^PYNLMSRS^C/riEDAANRS ITALDGLI 

sgwstoiatfotqknsldpsllkyfdtmk^Ikesfvttapwmvys 

NDEKTRAMADITRCNKIKAAIDKMLVEIK^AEI^KoOIRELVDTLTNFKSCSDDLIRNL 
SCLL^FLSGLTLKAVNDPriATYEAFTAE/FTEPFN^RQLATFESFVICCGOrCITPGG 
CO0LL0AMESSWDFSTF?I0NCOtALCyES3AMC^EWTLV5AALALLNQW 

CT712 hypothetical protein
N I MH PK I EKRNS LPLTAVAPVF E ESVH PSVATTVDYVDATTL3 RHLTVLKDV I KEARNLD 
LGKAFLTSMKOGFINTGTELAI IQASLADQSSRESRKKEEKIFHQHLGKAAPQAATATSG 
VQPTADPVADKMPLOSAFA-m-L^IPAQEEALYALGREI^LSGYAC^FSPLLDMIKS 
FNSAPINYNLGSYISCTSGTANFAYGYEMILSRYNNEVSQCRI^IASTVKAKAALAf^ 
SVKANVSL.TDAQKKQ I ED 1 1 ASJTTKS LDVI HTQLTDVMTNLAS ITFVPGLNKYDPSYRI V 
GGDLS 1 1 ALOND EXVLVDG KVL5 1 TTAVN EGG LLNF FTTVLT DVQNYGDLAQTQQ LMLD LE 
LKAMQC^SLVSASUCIXNCa^YTTVISGFKN 

CPn_0854 972849 971806

ompB-Outer Membrane Protein B
GPFD!WSKMIJCHLRIATI£FSMFFGIVSSPAVYALGAGN^ 

CNSYDLFAAIAGSUCFGr/GDYVFSESAHITNVPVITSVTTSGTGTTPTITSTTK^ 
LNNSS I SSSCVFATI AKJETSPAAI PLLDIAFTARVGGLKQYYRLPLNAYRDFTSNPLNA 
ES EVTDGLI EVQSDYQT VWGLS LQKVLWKDGVS FVGVS ADYRHGS S P I NY I IVYNKANPE . 
iyfdatdgnlsykew/as IG I STYLNDYVLPYASVS IGNTSRKAPSDSFTELEKQFTNFK 
FKIRKI TNFDRVNf/fGTTCC I SNNFYYSVEGRWGYQRA I N ITSGLQF 

CPn_0855 974001 972994
gpdA-Glycerol-3-P Dehydrogenase

GLMKQHIGYLC^rW3FCIJ^LLA^GYPWAWSRWPDLIKQLOEERRKPLAPr^ 

LS FTTDMKEAlTOiAFMIVEGVTS AG I RPVAEQLKQ ITDLSVPFVITSKGIEQNTGLLLSE 

IMLEVU3DSVTPYI£YI£GPSIAKEVIi«Src 

NTDIKGAALOGAUWIAIACGIAEGLSFGNJJAKAGLVTRGLHEMRKIAA 
GLAGLGDLGvTCFSESSRNIJlFGHIXACX3LTFE0AKAKIGMV^^ 
I DMP ITTG> YRVLYENLDLKEG I ALLLQRNTKEEFL 

CPn_0856 975410 973995

AgX-1 Homolog-UDP-Glucose Pyrophosphorylase 

GSRDRIA^TVhrrESVYSPSAMHVNSrADKLi<AINQEH I LDIWPSLSPKQQQRLFQCjLTS 
VDID0tacXX3LLSSPTAIUOF14PITSFASSGEDPEllAHAGTTIXKEK 
GSRLKCDGPKGLFPVSPIKKKPI^QLVAEKVRAASKLAGQPLPLAFffTSPLNTROT 
ESNWHU)P^VDFFCQPLWPIXTLSGDLFLEDMDTLAIX3PNGNGCIATLLYTSGV^ 
GIEMVSVIPIDNPIALPFDVELCGFHAMSNNEVTIKAALRCTAIEEMGILVKSHDS 
G(^SVIEYSEIPQNERFAIJ«IEDGKLKYCIANIGLYCLSMDFIRHAAYC^LPLYKVHKHAK 
GHTSLNEKNAWKFEEFIFDLFCYSDHCCTLVYPRQECFAPLKN1,EGNHSPDTVRQALS 
kERQLFHKVTGKKLSPNTTFELEADFYYPSTSTSLHWENKAFFEEPFFEAS 

CPn_0857 975808 975392

CT716 hypothetical protein

^RQYI KTARG ISRLMRDRLGSLS L I LKVK I HKYI^U^OKRLALTVSRNIQATNKR 
I^LHLERraFISRONIKHTOILLEYLJCTLQSSLYKQQSESLRFL 
mEKIKNNKYSKDQEIGT 

CPn_0858 977115 975757

fliI-Flagellum-specific ATP Synthase

RNSETRNQRJITRPSTFCFDSMNHI^IKEKLHIHNWQPYRACGLLSKVSGNLIEVDGLSACL 
GELCKISSTKDPNLLAEVIGFHTJHTTliMSLSPt^SVAJLGTEVLPLRRPPSLHLSDHL^ 
RVIJ3AFGNPIDKKEDLPKTHRKPLLSLPPSPMMRQPIDOIFPTGIKAIDAFLTLGKGORI 
GVFSEPGSGKSSLLSAIALGSKSTINVIALIGERGREVREYIEKHSNALKOORTI I IAAP 
AHETAPTKVIAGRAA^IAEYFREQGHEVLFIMDSLSRWIAALQEVALARGETLSAHQYA 
ASVFHHVSEFTERAGNNDKGSITALYAILYYPKHPDIFTDYLKSLLDGHFFLTSQGKALA 
SPPIDILSSLSRSAQALALPHHYAAAERLRSLLKVYNEALDI IHLGAYTPGQDEELDKAV 
KLLPS I KAFLAQ PLSSYCYLDNTLKQLEALADS 

CPn_0859 977597 977055 

CT718 hypothetical protein 

VFLVTTPQSPGSLSOSHLPHPHDPWDTEPTSLPEDPNDKASOELHSLVHLFRKLSIHLLS 
EVEKWG^LKPDLLELALLICEKFLYKKLENPQELALLLSTAJWRHTTLRSLTPIKVFLH 
PEDLKTLTDWI STHELPWI KHAEFFPDTSCRRSGFK I ET PNG I LROEI SEELDHLLSVLT 
A 

CPn_0860 978639 977608

fliF-Flagellar M-Ring Protein

RTLVFFONLAKKLTALGISFLGCLLIGGWSCAILFGRSSNPSLAPTQVKTEKTSGNWLK 
LTOMGNPKL I ESLTKKECLEKDLTSFHP I ASAKVAI ALSTEDDVMS PLHLSVILTLRKEE 
SLTPSLLFS ITDYLCSSLF GLKREH I SL3DNLGHLY IPES ITVNSLFI HTLENYLGK I FP 
KEHFALAYHAKAEKPTLCLTLNENYIAHLTKEESEKIVAHTKHYLYQNYDDSYDIVIETL 
PFARLQNKKSFPAKVLIC^MILVISLMr/ALASrrLARHAYERVSPEPRKIKRGINISKL 
LE I IQKES P EK I AL I LS YL ? PKKAEALLNR L P EDLKH OVLK YK L 

CPn_0861 979752 978925

nifU-NifU-related protein

AS Y PFTWKFLMTLPLE PM I FWSS LSAIC/MKR F LT PHC ACTFS EEDAEAK EAHLVTGKQGH 
RLMGNC'yTFYWLVDKKNGV I LDAKFQYFGHPYL I PLAEAVCNLVCGKSYSEAYKMTLDDI 
DKSLRVHAHQPALPEDSI^LYUFVIDALDTAVEQCLEIPLEDGSLPLONSPMNLDFEDAN 
PYSQSDWEALTH EQKLYALRAT I AEK IGPY I AMDGGEVTVESLENF IVT IAYSGNCSGCP 
3SLG3TLNSIC0LLRAY IYFELQVKVDE35LNL3HP 

CPn_0862 *»H0-*24 VV)1?.2 

y thO-Mi£S- rr- L.iti^l p : or. ^ i n 

GRGTIFRITCGKTl'L* [:'MEK['QfIRKAPP rFWLriflQVAI r'f J 3ERVKE:*>YAL.Ifl*D [I'ijLPPG 
3ALKIAEKTEEC IRV?L\AiLKL):;H I FRF'/PHFI 'H'/VH LVI.AALVFJML:'MF'^ IRNH 1 1 LPAH 
DQQLL I N3LCRHOGU ITTY I WTVNH E' 'M t VEF/jL [ ETL: I PR:' I .LFijLiIAAH( :[.T(*V IQP 
LDPLLCLCKDRR [LUU.P!;*[J[Lf iRAf'LTPI-! I UIAD t tTF33AALG< W>'.'> K*C IV TRKSL 
ERVFSCWFPPHTSAi'UTr'AVAAMOTAf.FF.R I ::ALPr,FTFHT::NrA'KKL IQEUJSVLP:* t 

OLaf:;evqnrlpniwaai PDn>AE::ty..FHLn^ [yp:;u:yerfopl-\uvloni i:;pf 

I/rH:;ALHFr.LTER:;KUl.E5-::K[AHAMMIwr»IKIIL.T[ J LLt:::. , ;3 
EHMALL [ LLPHCOSVWNEKNLF.TGWVDI PL3QQG £ EE^^BaIQNLP r DC I FTSTLVR
iLMTALLAMTOHHSKK I PY I VHEDPKAKEMSR I YSAE^HJ PLYQS3ALNERMYGELQ 
GKNKKOTAEOF^EERVKLWRRSYKTAPPQGESLYDTKOlWffYFEKNILPQLQNGKNVFV 
SAHCNSLRSLIMDLEKLSEEEVLSLELPTGKPWYOWKNHKIEKKPEFFG 

CPn_0864 981658 982374

yjbC-predicted pseudouridine synthase

YGVNVTKVRUIKFLASACVASRRKCDEI I FSCSVTVNCRVAEGPFVLVDPEIJKVQVCCTS 

7 q:-- K vvy KM 7;i-.-.\:';7:. •. W'.VYY KI « TKL7 1 '. « .F AH LFY RYFTVGP Z.?j¥. !i"P:J' JI . I LVTN 

jv;rfank < r m ; : . : . : rr^: j.K7::m*/::akdi , ;f :.msttf re :kiivp*v::vtk irp'TTVk 

I WSEGKKHE IRLFADAAGFP I LELKRI RIGSLVLGGLRYGEYRELTDAELGTYMKLSD 
CPn_0865 982412 982942

CT865 hypothetical protein
SPMGYVFYVIAGSIFUSISUIAYCQLYYSW^ 

EDAOSQ KE I DFLSQC DKLSWRAFLKNSYEI I PTFKEMEDLLS ERVQGFLES I ET IAEHDR 
AILCIENFWASKNLFDFEIAAYEEAVEKYDCLRQRAPLRLASKLFRFLDVPSIRFSS 

CPn_0866 983494 982916 

NMKVI YYEI EEI PSTN^^SYMHLWDPYALTVISTKCCTAGTGKFGKSWKSSKGDL^ 
FCFFITDLHI DVSRLFRLGTEA WALCKDLG I TEAK I KWPNDVLVHGEKLCGVLPETLPV 
EGLLGVVLGIGIJ^KjNTTKQALKDVGQPATSI^EIUjHPIDLETTRELI^IHHLI^^ 
PDSLATKSNRGNI 

CPn_0867 983405 984667 

rodA-Rod Shape Protein

C I RI PQMHIGFCHCVRGGNFFYFVINNFH ILE I YSLLNSNT IMRYHKYFRYVNSWVFLW 
LTLMIXSWVISSMDPTAMLVTSSKGIXTNKSIM0LRHFAIX3WWFFICAYFDYHLFKFW 
AWVL YF FM I C ALVGLF FVPSVQNVHRWYR I PF IHMSVQPSEYGKLVIVIMLSYILESRKA 
D I TS KTT AF LAC L WALPF FL I LKE PDLGTAL VLC PVTLT I FYLSNVHS LLVKFCTWAT 
IG I IGSLLI FSG IVSHQKVKPYALKVIKEYQYERLS PSNHHQRASL IS IGLGG IRGRGWK 
TGEFAGRGWLPYGYTDSWSALGEEFGLLGl^FTLGLFYCLICFGCRTVAVATDDFGKLL 
AAGITWIAMHVLINISMMCGLLPITGVTLILISYGGSSVISTMASL^ 
Y 

CPn_0868 986733 984670 

zntA/cadA-Metal Transport P-type ATPase 

NFRNGLGVRDLHHFREYYLI INEI I ITGRYVFSRLFFTSFSAEWNTFFESGMSEDTSPL 
LSKQNRKLSHNLPLKSAYLSLGTYLIALLSFWIJiAKNLSNLFVVFTFFlACT 
NICQKWNI DILMTSAAFG S I F IGGALEGALLLVLFAISEALGQMVSGKAKSTLVSLKQL 
APTTGWLV^ETCNI^KVAINKIEV^ 

KSGHPGSIVPAGAHhWEGSFDLRVIJ?TGSDSTIAHII^VIQAQNSKPRI^ORLDKYSSV 
YAIiS;I FA I ACG I ALLVPLFTS I PLLGPQSAFYRALAFLIAASPCALI IAIPIAYLSAINA 
CAl&GvI^KGGVILDRLVSCNSVVMDKTGTU 

SSSftPIAEAIVSYLMEQKVSSLPADRYLTVTCEGVRGYH'IEQEAFVGRVETCI/j 
LEDTEQK IYQAKQHGE ICSLAYVGNSFALFYFRDI PRPQAKEI IQDLKDLGYPVSMLTGD 
HKV5 A£NT AE I LG I SEVFFDLT PEDKLAK I RELATQRQ IMMVGDG I NDAP ALAQ ATVG I A 
MGlAGSATAIEAADIVLLHDSLSSLPWI IQKAKGTKKWSQNLALALAI I LLVSWPASLG 
I igjWLAVI LHEGSTVI VGLNALRLLKS 

CPn_0869 987479 986658

CT22-S hypothetical protein 

EG^FFPKTSEOTSDCRQHQIUlKIwrQDPHDHFKSRTPEDHIKHVRDKHRVCKGEPHT 

tfjksffyhlannalstgvfiff irtlffliptnralqvkslislgvgwtfyhgclkarka 
wayI^lshrsmleekneieenfeqekielrilfeng^fkdplloemveyvcsdstlli^ 
mi reelyi rkedlphpliqggs rilgglcglai flplvlc i sytlagvfsalmvlvlsfl 
kakilkndkisemvwvlgifitsasiisslmkll 

CPn_0870 986881 987448

sel^-Seryl tRNA Synthetase-2 

TTf HPTQGFGGAVI LPFSP I S I ARRI KKSCCSEKSS IYSHFCTLLLNNETSMLDI K I IR 
TPEfiCETRLRKKDPKI SLEPVLSLDKEVRQLKTDSETLQAQRRLLSQDI HKAKTQGVDA 
NLlQEVETLAADLEKI EQHLDQKNAQLHELLSHLPNYPADDI PVSEDKAGNQVIKSVGgL 
PIPSFPPKHHLEIJ^0ELDILDFQAAAKTTGSGWPAYKNRGVLLEWALLTYMU2K0AAHGF 
QLWL.PPLLVKKE I LFGSGQ I PKFDGQYYRVEDG EQYLYL I PT AEWLNG FRSQDI LTyEKE 
L P^QfAACT PCFRREAGAAGAQ ERG LVRVHQFHKVEMFAFTT PNQDDIAYEKMLS I^EEM 
LTEL^PYRLSLLSTGI^SFTASKTIDAEVWLPGQKAFYEVSSISG^DFQSRRS^TRYK 
DSC^LQFVHTL^SGLATPPiLVAILENNC^AICSWIPEVLRPYLGGL 

CPn_0871 988766 989899 

ribD-Riboflavin Deaminase

eymedfseqolffmrraieigekgritappnpwvgcvwqenriigegfhaVaggphaee 
laiqnasmpisgsdvwslepcshfgscppcanllikhkvsrvfvalvdpotkvagqgia 

MLRQAGIQVYVG IGESEAQASLQPYLYQRTHNFPWT ilksaasvdgqvadsqgksqwitc 
peiarhdvgklraesoailvgsrtvlsddpwltarqpggmlypkqplrv\^,dsrgsvppts 
kvfdktsptlyvttercpenyikvldsldvpvlltestpsgvouikvy/ylaqkkilqvl 
veggttlhtsllkerfvnslvlysgpmilgdqkrplvgvlgnllesa/pltlkssqilgn 

3LKWWEISPQVFEPIRN 

CPn_0872 989903 991216

ribA&ribB-GTP Cyclohydratase & DHBP Synthase
KER I FRVAC LAS ESVNARESM I ETREEVG S ANFVS LEPA I EDLHRGKFVI WDEAS REDE 
GDLI IAGEK rTVE^FLLQHTTGWCAALSQERLLSLDLPPMVKDNRCRFKTPFTVSVD 
AAHG VTTGVSAADRTKWQLLADPKS K PEDF I S PGH FF PLASfi PGGVLKRAGHTESTVDL 
MELAGLQ PCGVLAELVN EDYSMMRLPQ ILE FARKHN I AVI P/TS 1 1 AH RMLS DRLVSK I S 
SAPXPTIYGDFTIHWESLLEGMQHLALVKGNVAjGKSNVLWVHSECVrGDILGSKRCDC 
GEQLS5AMS Y I AEKCTCVLVYLRCQECRG IGLGHKVPAYALQDNGYDTVDANLAMGFPVD 
3REYCIGAQILVDLKLTTIKLITHNPQKYFGLCGFGL3I#ERVPLPVRISEDNEQYLRTK 
OEPMGHWLDLPiXNNRVQ 

f.PnjytT, '.»'.» I LHH 'I'MWl 

r LLF Ri.bi.ty I lumaj ine :>ynth.ir,^ 
I.vHAVT I' JYNNFEEYMKTLKGHLSAKNLR I AlVGJ^FtJQAMADALV^.'TQETFLKFCjGijE 

[/ ;lmt i p.vr-i :afe i pot r kkll:.:::kkkfda ivacovl iogltuiiyno fvnovaag k :alj 

LEKCLP tTL:' I V AA P. 1 * A E I AWO R 3C I KGR H LG V ^GMTT A I EMA' r L FTO I 

'TV: ; hyporh<.-r um I protein / 
\A :;LW,K [LTKOR[>REFA::MLK LLK I KVLVFPLALLMGGN.'; [f;YAt;[\V::LfJTN:;OTKVK 
t<",'.'J\/VI f ll'JKt-RUYPril.LWLTEnntlAPLL'DTlTP [DMAY.^EKLFNKKVPALIjE AIRoM [HL 
HLLIO(:SR03-MQLGOrLP^«^FKCF7rAll^LLFFLN::PK^FDNTL.R:LETA[VL
^L^SpS^ 

eSv^tS 

LRSHREMLNKQLLPQCTVLDFSETTLSSGOLpVFAES r AVR IHLNGA^ o INL 



CPn_0875



1 r 03h3 



RL3RRSRRLFAP.P. LfcTUKbTLg VgAMFjfrY AEK I J EgDERDL •J t*VySSAAEKSo I ^LALs 
QGE I KDALYR I R EVH PLAL I EALAENBAL I EGMKKMQGRDW I WNLF LTQLS EVFSQAWSQ 
GV I SEED I AAFASTLC LDSGTVAS I ^OGERWPELVD I V IT 

CPn_0876 994123 995517

dagA-D-Alanine/Glycine permease 

siatgetmlyfieqlnklstsfoCtpmilllggfltoklrglqfhgljcl^ 
ds s skanevs sy eavag i lagnfgtgn i agmavalacgg pgalvwvwlaallga ivqyag 
sylgskyrkpeentgefiggp/aclafgmrkkilagffalft imtafcagncvqvsc ivp 
lcaectpgkllvg i llalwipvlaggnnr i lrfsarv i pf i agfyc i scg 1 1 lfqhasa 

ILPAIKLICSSAFGIKAGLAfclGGYTLSQVISTG INRAVMATDCGSGMVS I LOANTKSKN 
pVVDGLVTLVPPVIVMWOSITMLVLIVSGAYSSGAfXrTL^SAFKNSL^ 
AMALFGYTT ILTWFAC AEKSLQYM I PGRRANLWLKA IYVLII PLGGV I DMRM IWALSDTG 
FSGMVILNC IALIALLKf VLSTNRDVALLKERECSVADPVRNLDA 

CPn_0877 995521 995982

ybcL family

RRR I MOLLS PAF AYGAP I PKKYTCQGAG ISP PLTFVDVPG AAQ S LAL I VED P DVPKE IRS 
DGLWIHWIVYNLS^ITNIAEGAEIFAVG^LOTSGKPVYEGPCPFDKOHRYFFTLFALDV 

VLPEEENVTRDQ^EAMEFH 1 1 EQ AELMGTYEKS 

CPn_0878 996660 995992

SET Domain protein
GCMSTVTTE&SSIHISIJW>^SQPYS 

HKSEKRRL^PLAKWLGKLHKQDIXCPPAPPVSVCWINAHVGYGVFARX>EIAP^ 
TG I LRHRQrIWMDENDYCFRYPMPLFTLRYFT I DSGKQGNVTRF I NHSEOPNAEA IGVFS 
EGLFHVI/RTVAP IYAGQEICYHYGPLYWKHRKKREEFI PEEE 

CPn_0879 997463 996645

yycJ-Metal dependent hydrolase

YRILS^SMOGFFPLASGSKC^SAYI^DSCKILIDLGVSKQVVTREU.SMNIDPEDIOA 
I FVTHEHSDHI SG I KSFVKAYNTPIVCNLETARALCHLLDSHPEFK I FSTGSSFCFQDLE 
VOTFWPHDAVDPVAFIFHYT^EKLGFCTDLGWVTSWITHELYTCDYLLIEShmS 
Q3QRPDVYKKRVLSKLGHISNQECGQLLQKI ITPKLKKLYLAhiLSTECNTAELALSTVSE 
I AS ITSI APEIALAQG ITSPIYFSRLEVACPR 

CPn_0880 999864 997444
ftsK-Cell Division Protein FtsK 

pmirel^srhprlptlplaajcaslyixfacfsglslwsfhrdqpctqwigllgwsfss 
ellyffgaaaffiplyflwlsflyfrrtprplffykaaaflslpfcsaillsmlspvgtl 
p\lldtrlpkfidc^ppvswggipfylfyegosfclkhligsvgtalifgfvmlfsvl 

3IAIXKKKTFQIXmKAFCSFFC^FKNLKKLI^RNYLPKPSVPFVSKNPFSCTK 
JpSPRRVSETI ILDGSISPLPQEEI PGSKKESFFLTPHPCKRFLTKFVEPQENKAKEGK 
,TIALSSTPTVVRESKGKERAALPKIJ<SL^VPE^roLPQYHLLSKKRElARPESLOA£^ 
LILKDTLTSFGIDADIX^ICSGPTIAAFEVLPHSGVKVQKIKSLENDIALKLQASSIRII 
APIPGKAAVGIEIPTPFPQAVNFFOLLEDYQKTNRKLQIPLLIXIKKA^DNLWADI^TMP 
HL I IAGTTGSGKSVC INT I VMSMIMTTLPSEI KLVI I DPKKVELTGYSQLPHMLS PVITE 
SREVYNALVWLVKEMESRYE I LRYLGLRNIQAFNSRTRNKT I EASYDRE I RETMPFMVG I 
IDELSDLLLSSSQDIETPIIRLAQMARAVGIHLILATQRPSREVITGLIKANFPSRISFK 
VSNKVNSQIIIDEPGAErOOJGDMLVI-LPSVFXn'IRAOGAYICDEDINKVIODLCSRFP 
TQWIPSFHAFDDSDSDNSGEKDPLFAQAKTLILC/TGNASTTFLQRKLKIGYARAASLID 
QLEEARIIGPSEGAKPRQILIQNPLEG 

CPn_0881 1005646 1006209 

No robust homolog present in Genebank/EMBL as of 11/7/98 

NKKFAVHMPVPIDNSSRNLQEVPESLEDLEQHAEESPTHQSAESSSLQLSLASSAISSRV 

EQLSSLVLGMENSDFSSLRDVPIFSAIYESSTHTPVPTPLVGVGYINGSQSGYYDTQRES 

LHLSQLLGSRRVEVVYNOGNFWEASLLNLCPRRPRRDPSPISLALLELWEAFFLEHPPGS 

TFNPIFFW 

CPn_0882 1006169 1007404 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NTPQVALLIQYFFGNGAFYVREALRLTPHAONIVLVGICPSLYPEHPRSFYYRVSGDIGS 
R FDDRG FVNSGVETLPYS SGSFG I FW 1 3 FTDPTFNFAI VNTFMRT AG INEVS RPMTQDTE 
TS LI EMRDLSEOOEANNTDSLEQEESLMG I VGHTVGGVSMTVTSS PN I FYRIQTLLGLPE 
TLAEAEENPTFPNSTIDSLAEIMMNLVRISDAVS IFWI FPIVDTTYNGVLLAVCIGFFG I 
^ICSTFLMLTNPRSRRDRV^LRI^CYRSLGSGMNLFDLSrJNVRhlAARRH^/rSCrVA 
LYANIVTLFGVm/AIODALOYGFPSVRDAFYRYCLRHRYCLTQRNEDSLOTTGTRFQVTRT 
HLEDQQMVAS ILNLSVFGLFFGFVGLMTTFGGLEISPSCRWDAANNRTVG I F 

CPn_0883 1008904 1007573

dmpP/nqr6-Phenol hydrolase/NADH ubiquinone oxidoreductase
LYELF I KSG I F IVMTWLSCLYF IC I ASL I FCA IGVI LAGV I LL3RKLF I KVHPCKLK IND 
NEELTKTVESG0TLLV3LLSSG I P I PS PCCGKATCKOCKVRWKNADEPLETDRST FSKR 
QLEEGWRLSCOCKVQHDMSLEIEERYUIASSWEGTVISNDNVATFIKELWAVDPNKPIP 
FKPGGYLQITVPSYKTNS3DWKGTMAPEYYSDWEHFHLFD0VIDNS0LPADSANKAYSLA 
3YPAELPTIKFNIRIATPPFINGKPN3EIPWGVC3SYVFSLKPGDKITVSGPYGESFMKD 
DDRPLIFL[GGAG^FGR3H[LDLLLNKHSKREIDLWYOARSLKENIYQEEYEMLERQFP 
N F H Y H L VLit E P L P ED [ /VAGWDK DD PT KTN F L F RA FN LGO Lf; R L DN P ED Y L Y V/CG P P LH N 

[lkllcdyc;ver: , .s r t lddfgs 

CPfi_'JHH4 liH}'M»,K IO()'M)f)'J 

CT741 hypothetical protein

( ;[X;MLr;R [ VTCFI JXL: I: :LI > LFAEEEA/vg::KNTF70l 'AVM! ,A f A I E J- l-'Y F [ T M't'EOKRR 

kami-:kpkndlakodkvtami; i rcrrvDM nrarrv i e,n i A:;f -.kvi-ivlk' :a i :;ei [.kpnonk:; 

i:r*n_0HH r > liMOt.'iO I'Kl'M i : 

yqcA-rKNA M».>r liy 1 r. t , ins t < t : j ■ 

ASlt.TM.'n-MONCrnKCVci u i.»:;M7.';!>:;i.KKKEKt..l.Mfjl.KAl'I.VI'::HMI AI'C \\\'S,\> 
:JI,R';PNKMFFSFFOTYF/ IKK:'! / W I : ITKl'KK''; I f'VTT'.'LL. rill-lO'I'MI) [ LKi.Tfd-WWlJKM 

PEiJ4AYFPi , KNK( , .::i;."rr;iwrf v.'[iFMvri;n*::« rn-F.YiiVNivV totMKK 1 1. 1 ;i. 
^ke^p/yledakafckrn^kapdvilidpprcgI^lkvilrigspkivyisc

NPKTOFOEC ADL I SGGYR IKKMQPI DQFPYSTHLENI I LLERE I OP 

CPn_0886 1011288 1010903

hctA-Histone-Like Developmental Protein

^LFM^LKOTAKKMKDrXDSIOHDLAKAEKCNKAAAORVRT 

i Ki'r" Tr Al Ai'VK KTA KK K A i "K i ' . \:AAAAr:T.:KA7KA::K}-A:l KTA,-\!\K »- r r.» ., 



CPn_0887 1011692 1014157

^DYKKA^LAK ELVAAX-EKGSC SPHPEIVQI EKTFLQKTLLALQI KVAQEAQESC DALL 

££c^ ayS^ 

HCAYLESTLLYYAY^ 

akgctesaekwlqi^ 

ahlsnhllehveknlisprsyrdyygeslqrtlglcqrfi^v 

CPn_0888 1015441 1014119

^fl^otucliheis^ 

SLI^F^POTQDSSVQDFLKRHSSQN^ 

REA^G^RSY^SPKKSKTDRYI^t^PSMGTLITTIQEKLPATVKFSTSVTHIDC 
FSLPKGYGMLFADELPLLGIVWNSQIFPQATPGKTVLSLLIEGKWRESEAHA^AIAALSE 

ylninqkpdafalfssqdgmpohavgflerxerilphlpgnlkivgoniagpgl^cias 
ayhaicdlhteetlaqpqssl 

CPn_0889 1016841 1015462 

hemN-Coproporphyrinogen III Oxidase

fIu^fkfleglto^ 

CO SMCLYCGCSWLNRREDI VEAY INTL I QEMKLWET IGFRPQVSRIH FGGGTPSRLSR 
ELWIXFDHIHKLFDLSHAEEIAIEVDPRSLRNIMEKADFFQ 

QE/SVRRROSHEESLKAYEKFKELAFQSINIDLIYGLPKQTKESFSKTIQDILAMYPDRLA 
LF^FASVPWIKPHQKAMKASDMPSMEEKFAIYSQSRHIXTKAGYQAIGMDHFSLPHDPLT 

Ukknktlirnf^yslppeedli^i^otstsfirgiyujnajctlefawvi^ 

K^f LTEDDRIRKWAIHKL>1CTFTINKEEFFNLFGYEFDTYFIESRDRLISMETTGLIHN | 
SEfgLKVTPLGELFVRVIATAFDHYFLNKVSKKECFSASI 

CPn_0890 1017829 1016819

hemE-Uroporphyrinogen Decarboxylase
ST^toSMSAFFDLLKSOTASHPPIWLLRQVGRYMPPYQELKGSQSLKTFFHNTEAIVE 
ATDHSPSliHVDAAILFADILSILIXSFAVTYDFAPGPRIQFSPEQPFTFTSDPO/riFSY^y 
LI^fcLKQKLPVPLI VF AAS PFTLACYL I DGGASKDFS KTMSFLYVYPEKFDQLISTr^- 
EGTAI YLKTQMDAGAAAVQLFESSSLRLPSALFTRYVTE PNRRLI AKLKEQAI PVSLFJ 
CFtENFYTLQATQADTLHPDYHVDUiRIOKNI^SLQGNLDPAIFLLPQEKLLHYVEAfL 
VPpURTV PNF I FNSGHG I LPET PLENVQLWS YVQRQL 

CPn_0891 1021079 1017819

mfd-Transcription-Repair Coupling

NFl^MDFNPVNLDFSISKEFKEETLPLIiENIHPGATAFtAA^ 

LDPI.FEl^TFLDQAPVEFPSSEIDLSPKLVNIDAVGKRDHLLYSI^QHRApy'CVrTLK 

aiBktrspqatsqqhldlavgdvi^peattelcksi^sqvmltsekgefszrggivdi 

FPLSSPEPFRIEFWGEKI ISIRSYNPSDQLSTGKVSKIS ISPAYTEEASGGtfYSHSLLDY 

fs'&lylfdnleileddfadisgtlsslpdrffsigtlydristsnqvybSetpfpnvk 
nlksnrvi I eafhrnmeasrqai pilypeqi iqndenpllaflqhlqeympphgkplkla 
iy^ktkslkearaiaetvargdveiyektgnltssfalvneafaaisi^efastkvlrr 
qkqrthfsvtteevfvp I PGETWH I hng igkflg I ekk pnhln I etd/lvleyadkarl 
yvpsnqaylisryvgtsdkaadlhhlnsskwkrsrdltekslivyae^ujleaqrsttp 

AFVYPPHGESV I KFAETF PYEET PDQLKT I DQ I YNDMMS PKLMDRL BCGDAGFGKTEVIM 

raavkavcixhrqvivkvpttilatqhyetfkermaglpieiavlsrfsoakvqkliceq 

VASGQ I D 1 1 IGTHKLINKS LEFKN PGLL 1 1 DEECRFGVKVKDNLKjERY PM IDCLTVSAT P 
IPRTLHHSLSGARDLSVIAMPPLDRLPVSTFVMEiihrrETLTAAlJtHELLRGGOAYVIHNR 
t ES IYTLAET I RNL I P EAR IGVAHGOMGAEDLSN I FTKFKNQK7D I LVATAL IENGIDIP 
NANTIL I DHADKFGMADLYQKKGRVGRWNKKAYCYFLVPHLDBlSGPAAKRLAALNKOEY 
GGGMK I ALH DLE I RGAGN I LGT DOSGH IGT IGFNLYCKLLKMAVS ALKKHTS PLLFNDDV 
KIEFPYNSRIPDTYIETGGMRIEFVQKICNAES5EELTAIQEEMRDRFGPLPOEICWLFA 
LAEI RLFALOHG I SS I KGTANALWOKCLSKSEOTKKTLPJf ALSPTPELLVKET/I ES I ER 
CFLINAS 

CPn_0892 1023673 1021046 

alaS-Alanyl tRNA Synthetase 
EFFFMLSNTIRSNFLKFYANRHHTILPSSPVFPHNOPSrLFTNAGMNQFKDIFLNKEKVS 

ysrattsokciracgkhndldnvchtsrhltffemzgnfsfgdyfkaeaiafawevslsv 
fnfnpegiyawhekddeafalweaylptdrifren'dkdnfwsmantgpcgycsellfdr 
g ps fgn a3 s plddtdg er fleywnlvfme fnrtdtc gllal pnkhvdtg aglerlvsl i a 
otmt/feadvlreliakteqlsgkwhpddsgaAfrviadhvrslsfaiadgllpgnter 

GYVLRKILRRSVrWGRRLCJFRNPFLAEIVPiJl/pAMGEAYPELKNSLSQIQKVLTLEEES 
KFKTLDR<XJNLLQQVLK. r >:;SGSSC [GGEDAf/lKDTYGMPIDEUiLLAKDYDYGVDMDTF 
MKLEQEAKERS^KrAA/QSOGT^E:irYNELH/T^EFIGYDHLSCr?rFrEAIir:KDHIVSSL 
0 EKO EGA I VLK V: I PFYAEKGf IQVGDGG EISCG EOTF [VTHTTI'PKAGLI VH! KIR I SOGS L 
TV GAAVTAOVNR Y R R K R [ ANN! ITACH LL1/KALE ITLODH I ROAGSYVDDTK I P.LDFTHPQ 
A I SPEOLU: I ETLVNF.iJ I RENErVDIP.E/LYSPVMNGGE [ KOFFC.DKYSDVVP.TVSAUHS 
IIKU -•r/ITIIAEATf ID UIFFR [TKRHAVaA; tRR [ EAVTGt-'KAKA'fVi IQQG EVLEE I ATLLQ 
VF'KUj [ VSRI.TATt .DERKQQDKRLIJe/eN^L I OTKLDKL 1 1 1NCI igROG LTCL7HHLAEHE 
Nin<I/>jYAO»:^IMUin-:KLt^Lvm/KNt;KYIVL:-RVf;nntJT(.X^IIAODLLKAVLTPCG 

f ; hwr -f ;k do: w >• -a pal l 'A 1 r KVi .n otlwcw i rrro l l 



ppiAFTl^r-YSCCFYtEC^^tNKE^ELGKIAGAIKQrGIESIOKAGSGHPGL 

CTKOFWHL^E^FFV^ 

LXKQVRWGFPCWELF^W^ 
RFGYSGASDDVSEECGFTTEQ iiwrilso 

CPn_0894 10J&823 1025888 

PWTO^A^LRRK^KGER^KHTSESR I AQDMLERYSGSSVKQFC PYLLLTNFS YY IQT 

GLRSHYQVGDYFVPVAS/RGECrrSDAYFPPEVPAWFVVQKATTEV^ 
HTTNIRF^FNKKFRJ^YETKAQSAEHE^ 
CT^SGNFIITn^E^LTGQEVIENLEK^ 
MASGSETSDSDY V 

CPn_0895 1026973 1027557

E I DCFMVRVS^SfcFRVGLR I EI DGQPYL I LQNDFVKPGKGQAFNR I KVTCNFLTG jW I ERT 
YXSGESVETAMVERSMRLLYTDQEXjATFT©DETFEOEWFWEKLENIRQWL 
VLYNGDWAVEP P I FMELS I AETAPGVRGDTASGRVLKPAVTNTGAKIMV'P I F I DEC ELV 
KVDTRTGSYESRVSK 

CPn_0896 1027574 1027822

E^FFFJ&RNMEA^ I PESIREIE 
KEERVTrrPQLFQAIAEKILEEGV 



CPn_0897 1028794 1027853



::pfi_')H-i \ 

\ kt P. -ft.ini-.kti 



nSJSsStvwknksnprphoekpf^^ 

F^VHFQATTIGQRFPKVVRSt/*ADSVCITGDFSLTAMDGEFLLAK^^ 
IffGIfflDVYTLKSLAG^TFVrHFPNDQLQQNKVSFHKIT 

hlSS^etfiISspeenviianhypllssqot 

f HGHEWQAAVYNC ADTS PSY I LNSG S I SLPTNSRFHVI DLYPEKYQVHTM I LKNLLDFDAP 
LE I ANEATWDCQKL 

CPn_0898 1030511 1028904

Mitochondrial HSP60 Chaperonin Homolog
TKKRLG SVK I LRLLGVCMS EC EKLSNYNADKXLF SG I DKLFQ I VXG SYG P KQS LS PT S F F 
KERGFYAI SOTEI^NSYENIX^TOFAKAMVNKI HKEHSDGATTGLILLH^ 
GISTHKLIASLKI^EKLQEALCXXJSWPIKDAIJ^ 

pegliStkeS^smdvfqgfkipagyas^ 

IHSLiPLLQEISEQNQHLIircEI)IDPDVIATLVVNKLQGLI^VTVVriP^ 

miAiFTO-HICPCQEASHVIAPEWrUJSCLSIEISESCTrLIGGIJilPE^ 

A£EIRTTSCLETKKRLIKSTNRIJ3SSVAILPTDE1DNEPLYTI^IJ<IMESALSRGYVPGGG 

VALFYASLTLGTPKDDADENSIAISLLQKACCAPLKLIATNADLKSDAVIAKI^SLOTTS 

IjGISVFSREIEDLIAGGILDSLATTSTILAQALDTAILVLSSKILILENQYEISTL 

CPn_0899 1030848 1032215 

. l^RCCRQNYMR^^ 
GHOFUCHAATAGAVAAWSHDYCGDSFX3LELIRVDDTKSAWEAGSNCCNLFOT 
GSVGKTTTKEFSKTILSSIYKTHASPKSYNSOLTVPLSLLMAEGDEDVMILEMGVSEPGN 
MQDLIAIVQPEIAVITHITOHAMHFPQGIQEIUCEKSYILOKSKLQIA 
SCSPTAEKFSFSFNDPLADFCYKAISGDSWIGTPEENYCLPIAFSYKPAYTNLLIAVAL 
SWILEVPEEGVIRSLPELKLPPMRFEHSMRNGMQVIhTOAYNACPEAMIAALDALPLPSDG 
GK 1 1 LI LGHMAELGRYS EEGHALVAEKAASRGDM I FF IGEKW I PVQSVLKSYSC EVSFFS 
SAQDVKDI LKQVARYGDVI LLKGSRALALESLLACF 

CPn_0900 1032208 1033281

mraY-Muramoyl-Pentapeptide Transferase 

LVFNFLGASMI PLI PMFLKQSLFFS LALTGMTT LVLTVALGV PVMKWL KRKNYRDY I H K E 
YCEKLEMLHKDKAEVPTGGGVLLFISLIASLLVWLPVTCKFSTWFFIILLTCYAGLXWYDD 
RIKIKRKO^HGLKAKHKFMVQIAIAAFTLIALPYIYGSTEPLOTLKIPFMEGMLSLPFWL 

G KVFC LGLALVA I IGT SNAVNLTDGLDGLAAGTMS F AALG F I FVALRSST I P I AQDVAYV 
LAALVGACIGFLWYNGFPAQLF^OTGSLLLGGLLGSCAVMLRAECILWIGGVFVAEAG 
SV I LQVLSCRLRKKRLFLCS PLHHHYEYQGLPETK I VMRFWI FSFVCAGLG I AAVLWR 

CPn_0901 1033239 1034537 

murD-Muramoylalanine-Glutamate Ligase 

FCMRRSRY'SGCLMEIDMCQRILILGTGITGKSVARFLYQQGHYLICADNSLESLISVDHL 
HDRLLMGAS EF P EN I DLVI RS PG r KP YH PWVEOAVS LK I PWTD I OVALKTP EFQR YPS F 
GITGSNGKTTTTLFLTHLLNTLGIPAI AMGN IGLPI LDHMGQPGVRWEIGSFQLATQEE 
H I PALSGSVFLNFSRNHLDYHRWLDAYFDAKLRIQKCLRQDKTFWVWEECSLGNSYQIYS 
EE I E E I LDKC DALKPIYLH DRDN*f C AAY A LAN EVCWVS PEGFLKAIRTFEKPAHRLEYLG 
KKDGVHYINDSKATTVTAVEKALMAVGKDVIVILGGKDKGGDFPALASVLSOTTKHVIAM 
GECROT I ADALS EKIPLTLSKDLQEAVS I AQT I AQEGDTVLL3PGCASFDQFQSFKERGA 
Y FK LL I REMQAVR 

CPn_0902 1034507 1035241

nlpD-Muramidase (invasin repeat family)

AVDQRNAti:>EVN^RR[>IV [TA^i/VNAILLVALFVrSKRIGVKDYDEGFPiJFASSKVTQA 
W:; EEKV I EK PWAEV PSR P I AKETLAAQF IESKPVIVTTPPV PW:*. ET P EVPTV AV P PQ 
PVRETVKEEOAPYATVWKKGDFLER [APAN1ITTVAKLMO LNDLTTTOLK TCQVI KVPTS 
yDV^/KK'IT'OT^vp^itpinYY IVOEGD.TPWTI ALRNH [RLDDLLKMNDLDEYKARRLKPGD 

ftsW-Cell Division Protein FtsW

:;Kn:;i ( :;MKWFv i:;ru/i i fsu ;l [MVFr/rr;sAL-:vLDR:xb:c:7niKAL E povtyl tuiuw 
a:;llymh™rdf t.K i::i-vll;;i :;-j,\al u:n- r f<'.u'; lcmh ;M<wu',vrw ;r iup::ki-vk 

YI.VI'IVAI.Yn;iM-'j:r:i,YUKOL.KMKI.K[;tVi[tJ ? tr'[LL[AtEPWIf;:w\AVl:JA:;i,ll'VFIM 
T:;VRI J<YWL.t.! , UlJ:VtaA(^;AI.AYRMI'Y7l*YRLWYLHPELO[Kf;U^M0r'YOAKf AACISC' 
Kl.f/ ;K( ;!^A;'l-gKLTY[,PKAOri!,Y fAAI YAKKFGFr/MLVL t LLYMCr/"^? 1YA [A IKAS 
ILIANMCCVTLLLKV



';LEGAAU\MVITLII3MOArMNLCW.'3CLLPnKO-/NLI 
YDEENnK^UXTRRFRRPHCPSSUJKGSFFS 

CPn_0904 1036320 1037396

CPn_0905 1037400 1039835

I VDHF 1 1 DALHKFDKQQT I EQ AFTKEQDLVKR 
CPn_0906 1040514 1039915

E^AYIVK^NP^^^ 
NQEEQEEGNMHHSCECSNHH 

CPn_0907 1040816 1040445

' *^Ea Periplasm^ Divalent Cation Tolerance Protein CutA (C- 

fm^skfxiikssi^aVlil^ 

PLSK 

CPn_0908 1041607 1040780



dSlpSftkillRelamespki idlslsdaypgei IVTLSSGSLLRLPIKTLD 
raIJ^ykhmkkspviesekqyvydlrfpnflllkal 

CPn_0909 1041592 1041966

N^fFY^ YI S SAG I RLL LSNFKLVQSLGGKMCLCCVKESVTEVMRI AGLDQL I LLCQSEQy 

clskL 

CPn_0910 1041970 1043004

miaA-tRNA Pyrophosphate Transferase
FLYMLPFEFEFNTTSS PECDVCLDPQKLFVKLFKRT I VLLSG PTGSGKTDVSLALAPMI D y 
GEf^SMQVYCGMDIGTAKVSUCARQ^^ 

lsrWpilvggsgfyfhaflsgppkgpaadpqireqleaiaeehgvsalyedlllkdp/ 

Y A(§¥i T KNDKNK IIRGLEIIQ LTGKKVSDH EWD I VP KAS REYCC RAWFLS PET 
QMRCEAWLQEGLLEEVRGLLNQGIRENPSAFKAIGYREWIEFLDNGEKLEEYEETKR 

SNSWRYTKKQKTWFKRYS I FRELPTLGLSSDAI AQK I AKDYLLYS 
CPn_0911 1044079 1042985

Fe-S cluster oxidoreductase
SLl]£AlFNVNYFMNLCKRI SFEEGLELFVSSP I ERLQERADAIRKERYPSNEVTYVLpAN 
PNfYTNICKIDCTFCAFYRKPKSPDAYLLSFDEVRSLLQRWSSCAmTVl.WC^PGLCI 
DYLEELVR ITVQEFPS IHPHFFSAVEI EHACRVSGI 3 IEQGLQRLWDAGQRTyt PGGGAEI 
LSERVRKIISPKKMQPGCWINLHKIAHU^ 

SCPGFYSF I PWSYKPGNTALRRNVPQQAS I ETYYR I LALGRI FU3NFDHyjfeSWFGEGKS 
LjGAKALHYGADDFGGVILDESVHKATGWS IQSS EEEICNI IRS EGF I PVHRNTFYQH I SC 
TVSSL 

CPn_0912 1044120 1045760 

CT768 hypothetical protein

WIMDNSDNSFHTLETECGSFLNDELAVEEVASTESTEISDATLCSAEKKVAFILNKMRE 

ALTGSSQGSDLRLFWDLRKOCLPLFNEIEDTAKR.\DHWRCYIELT/EGRHLKCLODEEGS 
FVVGQI DLAITCLEKD I LKFQEGTEDKI FKDREDNFLESQALDK^QAFYKQHHTSLLWLS 
SFSSKIIDLRKELIWGH[WRLKSKFF0RLSNU3NQVFPKRKE/IEKVSC^FAEDVDAFV 
AKYF1GSDKETLKKTVFFLRKEIKNL0HAAKRLFV^S3HVFAE7RLKLSKCWDQLKGMEKE 
IRQEQGRLRWSAE^SKEVRQMLAEVSSLLIEGNDLSKVRK^EGISKKIRALDLTHDDV 
irSLKKEMQQLFDOLREKQDAAEHSYQEOLAKDKQWKEAAR^LAERITTFSKTCSEGNIT 
SE^REEVKJTLKELLGKMSFLPPPEKISLDNQLNlALOTIV«FFEEQLLSSPDSREKL\^fM 

rcvlkqrrerrqelkdkleqdkkllgssgldfdramqygXlveedkraleeldasilelk 

OOIOOLL / 
i.'Pn_001"l UM570'» L0-15'M5 

No robust homolog present in Genebank/EMBL as of 11/7/98
K 1 1 IC :K Y KR [ EATD: j A I AMR RNC I YAFDLDCTLLKG^.'JfJW.'l FYC YrXU.XILFSYKTLPP( . I 
YHKFHFKFFFOtFHrSI [R 

(Til O'.'M KUVM)'.» , . 

No robust homolog present in Genebank/EMBL as of 11/7/98
VFFWDLF'flFYY.'I IVTf<I .l^'SVPCDDLYEVALJ^FV.^TI ,T f ':'DFYAPVLEKLEEAFADTrGO 

v[u-v:r;:;KJFrviuHAOQu;is:iWA:VY^^ 

,\\vm itf: ;nn [ i,ui .ri'LMijGEEKTwn w^kkmakkyywn t v 



yb..-B-ioj ; >P ■fVK^l^BFDLLKVAAIt^ [DDKK GNNLWtDVBTISEFTOVW 
RLEELWKDGFIVTGKLLAo 

CPn 0916 104681) l^^ 4 

CPn_0917 1048064 1048539

K^roEA^RELVEETCLSVw FF PKVL I EQYS FNNEEQVFVRKEVTYF LAEVRGD I HAD 

pmeicdscvi^lqeglrli^pelrdltveadkfinnylfss 

CPn_0918 1049232 1048579

NFCPC LYG LLPOTYCQTASGNYSG EQTRREG IQGDKDPLDVCVLTEKN I HHGN I LLQAR P 

SPAKIErVGIYGKK^QKVIQLAHEDYLSYIGDTAEVN 

CPn_0919 1049375 1050430

f^sSeX™ 



QGIGSVG1 
RGNVII 

egrvyapj 



dlnckaivgvannqledssagmmlherg I lygpdylvnaggllnvaaai 
lpivlsklynqskttgkdlvalsdsfvedkuayts 



CPn_0920 1051423 1050431

t fi^t vnPTnTTAGF IRHRAF AVAI SLIYEYRP I LSVMACPAYNQTFKLYSAAKGHGLSIV 

SE^f^p^^ 
asgdqethettlaalqnqlnvvptdkl i al / 

CPn_0921 1051526 1052293

YK^pISpAFKEAFRMX^^ 
WFNQGDDNLPIEVPYA 

CPn_0922 1052266 1053927

r S^YRELRNA I WV^lKVSKFS EDRVGVMMPAS IG AF I AYFG I LLAGKTPVMMNWSQGL 

KCSVPWLLRIFGVSCA/ESDC^AVILFTSGTEKLPKAV 
DVMLAFLPPFHAYGFNSCGLFPLIJ4GVHVWASNPLNP 
DYILCTAKKQNSCLESLRLWIGGDALKin'LYECT 
TKESPR^CWMPIEGMDVLIISKETHIPVSSGECCLI^ 

GSL^GIPGDKVRUXFTTL^^ 

DYVSLMALAVSLFG • 

CPn_0923 1053966 1055093 

SWIDDLESKIASYHGAPNAFIVNSG^^ 

ISGQHHTFHHNNLEHLESLLQCYRI SSKGR I F I ^SSWSFRGTLAPLEQ 1 1 ALSKKYHA 
HL I VDEAH AMG I FGDDGKG LCH AIi3Y ENF Y AVLVTYGKALGTMG AS LLTS S EVKYDLMQN 
qpPLRYSTSLSPHTLISIGTAYDFLASEGEIARKOVFKLKEHFHECFDSHAPGCVQPIFL 

i/HINHEFHLWRELCCH 

CPn_0024 1057)01 1055028 

priA-Primosomal Protein N'
KRFTAKTKSMGY I ES5TFRLYAEVIVGSNINKVLDYGVPENLEH ^^CTAJ^ISLRGGKK 
7GV I YQ I KTTTOCKK I LP I LGL3DSE IVLPQDLLDLLFWI SQYYFAPL^TLKLFLPAIS 
3rA/IQPKQHYRVVLK0SKAKTKEILAKLEVLHPSC<3A\^KILLQHASPPGLSSL^ 
3QSPtH3LEKLGILDIWAAQLELQEDLLTFFPPAPKDLHPEQQSAIDKIFSS^ 
THLLFGITGSGKTETYLRATSFJU.KCCKGTrLLVPEIALr/QTVSLFKARFGKDVGJ^ 
KLSDSDKSRTWRQASEGSLR ILIGPRSALFCPMKNLGLI IVDEEHDPAYKQTEjPPCYHA 
PDVAVMP.CKLAHAT\VLC;SATP^LESYTNALSGKWLSRLS3RJWKAAHPAKISLI^LE 
REKSKTKILFSQPVLKK lAERLEVnEC/L IFFNRRGYHTrA/SCWCKHTLKCPHCDMVLT 
FHKYAIT/LLCHLCN^ JPKDLr^nCPKCLGTMTLOYRGSGTEKIEKILCOIFPQIRTt^ID 
"DTTK FKG3HET LLRCFAT^K ALVL I CTQM I AKGMNFSAVTLAV I Lr^DSGLY I PDFRAS 
EOVFOL[TnVAr;Rf.^RSIIIJXlF.[Linr;Ft J rDHPTIHSAMRODY3AFYr;OErTCRELCEYP 
PF [RL I f'C I FMGKCFKOTWKF^HRVHM I L K E(»H ^ E3TN P LM I'"/T PCG H F K [ KDTFRYQFLI 
(TjAYV [ p 1 7NKKLHHAL.MU\KL:;!'KVKFM [[IVDl'HTTFF 

rpv/'i hypor.h^r iv'.i I pi r >(■■■; in 

K.HWLFMEfJSONFHDTlxrOI.^nHYrJFJ^/^^t'l^n.LWrLPrrrAISAr^r^UPEKAV 

AtWITPPPETNL.L\>KKTKl":[WKi:VI'[JlPDL:«NALLKEKYPALKC/::;U- , AI'K[PCGt 

p-/YKFJiriEEVI.FFNRl.AKIi;iV-'i.f- , i'TKI 1 'rL,[IIAKTNrF , /r^lPNFF[ J AI-Art.hPyiHYKtP 

rrDYi igrjLTorJGf: i Ft .ply : '•: : i .l-;y f.k lkrhlwa [ lmrlpfayt i-k:;:' 



■ ■I'll O'tt'i 
i:pn_0'»J



Thioredoxin Disulfide Isomerase



CHHTCOTyLtR F FK DS DMK FVU/^AFVGCLLLTLPCC/^^WC^SGE>^CC^RPIAA^^ 
^KWO^EORQKNOELKAQYKVTGFPELVFIDAEGKQLARMCFEPC^MWSKVKSAI/ 



' . n !' l .v -iU., 1 ;^,",; %'yr!;Tv^ rrrwmvK 

SA^RWtQYDDLWDSLAIKIPHALPHRWILYSQGNSGl2iENLFDRGDSSLHQ^TG 
SN^VFNYPG IMSS KG EAKR ENLVKSYQACVR YLRDEETGPKANQ 1 1 AFGYSUjTSVQAA 
PEIFIYNSNHDQELISIX;LFERENCVATPFLELPE\nCTSGTKIPIPERDlXHLNPLSPNV 
VDRLAAVI SNYLDS ENRKSQQPD 

CPn_0926 1061035 1059884 

SFKDPAVSES^TIQQDHLTIOTIAIHFSTARPKRWLLISLGSGDFLEW 

YSLC^SAALQWPFTNSETSWAVTO^ 

SS PLKH PT IQKLAEAI LESLSRKN 

CPn_0929 1062301 1061186 

CHLPS 43 kDa protein homolog 4

EKFMAPIHGSNAFVEDILHSHPSPQATYFSSTRAQKlJiEFTtDRHPVLTRIASVIIKIFKV 
L IGLI ILPLG I YWLCQTLCTNS I LPS KNLLKI FKKQ PNTKTLKTbT^tJiAWDYSSKNRyA 
SMRRVPlL^^^LEICLSQAPTNRWMLISLGSDCSLEEIACKEIFDSW 
ANILVYNYPGVMSSTGSSStJCDL^AHNICTRYIJCDKEQGPG ITYGYSLGGLIQAE 
A^RDQKIVANDDTTWl AVKDRCPLF I S P^^^^^^^^^^^^H^^^Y?^?E^£S 
LE IFLY PTD S LRRSTVRQNKLLAPELTLAHA I KNS PYVQNKEF I EVRLS SDIDPIDSKTR 
VALAT P I LKKLS 



No robust homolog present in Genebank/EMBL as of 11/7/98
NKMSELATCSTGLQMWHTQ^ 

VIGSMILILFSS IALIYLYKKTREVDQIALEPLPEMISKDQS I IDFVKTRDYASLEKKAT 
FAYTirrHYYDGSMVFYREIPRFMLGSYLALRKDMDRQALF 



1062851 1063330 



CPn_0931 1064078 1065718



lysS-Lysyl tRNA Synthetase

CEftfe'FASQELGNSEAAMSRSTPRVRFAGRLVLF 

EFTSVHGLSEDAEITPIKFIEKKI^LGDILGIDGYLFFTHSGELT^^ 

LPDKJttGLSDKEVRYKKRWLDLISSREVSDTFVKJ^YII^ 

NlifGGAEAKPFTTTMEALHSEMFUlISLEIALKKILVGGAPRIYEl^KVTRNEGIDRTHl^ 
PEFTMI EAYAAYMDYKEVMVFVENLVEHLVRAVTfflDOTSLWSYWKHGPQEV^FKAF 
fTT!^ : ESIATYAGIDVDVHSDQKIJ<EILKKKTTFPETAFATASRGMLIAALn3EXVSD^I 
APHfJITDHPVirrTPL£KTLRSGDTAFVERFESFCI£K^ 

TKREBLPDSECHPIDEEFLEALCQGMPPAGGFGIGVDRLVMILTNAASIRDVLYFPV 
FD^EKTN 

CPn_0932 1067160 1065721

cysS-Cysteinyl tRNA Synthetase 
VKS&FVMAFSH I EGLY FYNTASQKKELFFPNHT PVRLYTCGPTVYDYAH IGNFRTYVFEQ 
ILKRTLWFGYSVTHVMNITDVEDKTIAGASKKNIPLQEYTQPYTEAFFEDLE3TLNIA" 
DFY^PHATHY I PQM IQAITKLLEQG I AY IGQDASVYFSLNRFPNYGKLSHLDLSSLRCC2R 
I SADEYDKENPSDFVLWKAYNPERDGVI YWESPFGKGRPGWHLECS IMAMELLGDSLDfl H 
AGGTONIFPHHENEIAOSEALSGKPFARYV^SEHLLIDGKKMSKSLGNFLTLRDLLftQE 
FTGQ^RYMLWSKYRTQLNFTEEALLACRHALP-RLKDr/SRLEGVDLPGESPLPRTLDS • 
SSQ^EAFSRAU^LNVSTGFASLFDFVHEINTLIDQGHFSKADSLYILDTLKiy 
GVL^TTSVCIPETVMQLVAEREEARjCTKJWAMADTIJ^DEILAA^ 

CPn_0933 1067532 1068578 

predicted disulfide bond isomerase 
PV I LLQN I KRCS LKQLKVLATLLLSLSLPTLEAAENRDSDS I VWHLDYQEALQKSKEAEL 
PLLV I F SGSDWNGPCMKI RKEVLES P EF I KRVCGKFVO/EVEYLKHRPQVEN IRQQNIAL 
KSKFKINELPCMILLSHEEREIYRIGSFGNETGSNl^DSirHIV^DSLL/RAFPMMTSL 
SLSELQRYYRLAEELSHKEFLKHALELGVRSDDYFFLSEKFRLLVEVGKfpSEECQRIKK 
RLLNKDPKNEKQTHFTVALI EFQELAKRSRAGVRQDA3QVIAPLESY ISOFGQQDKDNLW 
RVEMMI AQFYLDSDQWHHALQHAEVAFEAAPNEVRSH 1 SRSLEY IRHC^ 

CPn_0934 1066948 1068526 

rnpA-Ribonuclease P Protein Component 
YFVHPLTLPKOSRVLKRKOFLYITRSGFCCRGSQATrr/VPSRHMTCRMGITVSKKFGK 
AHERNSFKRWREVFRHVRHOLPNCQIWFPKGHKQRPVFSKLLg/DFINQIPEGLHRLGK 
TKATTGGECTPKSEKCVTAPR 

CPn_0935 1069100 1068957

rl34-L34 Ribosomal Protein

EDTVKP.TY0P5KRKRRNSVGFRTRMATRNGRKLLNP.RP.RHq 

CPn_0936 I0tv?330 1069470

rl36-L36 Ribosomal Protein

7L.MKVr;:;:;VKADP::KGDKLVRRKGRLYVTNKKDPNPKQ/.QAGPARKK 

..Pr,_[r.H7 h).v>4H7 lOti'.H'tK 

r:;M-;JM K L bosom, 1 1 Protein 

VK[<MAKK::::VAI<FAKIU<RLVEANFKKR:;DLRKrVK/L.';vrXEEKENARI;;LNKMKRDT3P 
1 , IU.ilHI'fM.i;Pt;RtM^;Yl.riKFAI:;RtCFRuMAJMt , ;jfIP';V[KA:;w 

■ <)■• in ur:o v, r . uu/<>m/ 

f-TVHrt tr/i-.r h-t Lt-.il |wor. : in -[l>:-j/or "'/)) p.:pt ul,; ivr ipl.ismicl 
».TINI.: 'Tl'l .UTMl.I' T:' [ I..LFYV t LfJOt.lIAY tflOKKKPir/ r'lWFFAi UWVuF KXWLLLL 

KiHWiAi ,i-:i<t vrir<PFnN.':ni..FDDi.KK. f ;LA<:M6E i p:;:/;dloe rv iijtekwfylnkdrenv 

';i'l:;FKi:[.VVI.[.K',KTYpr-|:iWVWKK( ;MKL)JWQKVKL7KX(A>ALKF.ASK 



CPn_09 3'.' H 

hTnr^Xsltli Igt^yff^eiel/gcxk^ekqnlkldvkeiefpetvfsrdietr 
v oviilh^ingvsll^^ 

VSIPEKTK^rVSEISE^ 
DFLLENSEC 

CPn_0940 107301/ 1071204

fr T PFLMKKTAS IETI WSNETEALLLENNLIKCHHPKYNVLLKDDKTFFCLAISLSHSW 
DMK^CLAPCVGYCTPEEYOGTjLDK^ 
E^DLLSSFIWYWSQPy/p^ 

LAYRNAKAYAATTLPS STL/PYQDFQN I LRMSQY PYR I EC YDN AHMQGAH ATG\T I VFENN 

gfdpkottfsidsekt^i^ 

TLNLTCIQ^IAKEKS^ 

iskhwckrgkalfeqe^^^ 
llarqkdfnksd 

CPn_0941 1075504 1073018

mutS-DNA Mism 

VMTEltKPTPMMEO _ _ - 

SGIFVSTVDTYvfiilG^ 
LLQEKFNNY IVAINRIGSLFGFACLDLSTGSFF I EECENTKELVTJE ICRLAPSEVLSCNK 
FWK^ AI VMotOOHLKLT LSTYAEW AF EHKFASQKLTTH FQ VAS LOG FG LKGLVP A I N A 

JJiO I L I S P FYNPKE I LVRQDAVEFFLRQ VTLRKN I KTYLCQ VRD I ERLM 



AGGLLSYIQD 



TKVTTG 
NGDLPLR 1 

gyyievsse; 



.LPTKiilAIPC/TRGKCXJKI^IITrASQvNLELLAPL^PQGKNSLLRIM 
10 1 L I S P FYNPKE I LVRQDAVEFFLRQ VTLRKN I KTYLCQ VRD I ERLM 
'GTUU3SFSAGAQIYEQLASATLPEFFIDKCSLDTKLASLIALLSKSL 
;DGNIFVDEFH>IDLKiUJlHNQEHSQEWIWEYQERIR^ 
« 1 A& ™£FA*}LPKi>FIRRQSRU4AERmiEL^^ 

cshiwSteii^qsladldyiisladlahacx^crph^ 

VDTGKFIPNDTEMRGSQTRMI LLTGPNMAGKSTY I RQI ALLVIMAQMGSYIPAKSAH IGV 

idkietrigagdnlskgmstfmvemaetanilhnatdrslvildevgrgtstydglaiaq 

avve&lftdkjocaktlfathyk^ttl^ 

gihvarlagfplcwsraqqilrqlegpesitrpaqdkmqqltlf 

CPn_0942 1075955 1077754

i»C S ITKLRTAMYTEES LDNLRH SID I VDVLS EH I HLKRSGATYKACC PFHTEKT PS F IVN 
P AG AHYHC FGCG AHGDAI G FLMQHLGY S FT EA I LVLS KK ^^^Q^*^^?^ ^f^9^ 
/EELJWINSEAETFFRYCLYHLPEAPnHALQYLYHRGFSPDTIDRFHLGYGP^SLFT^A^ 
ERKISQEQLHTAGFFGNKWFLFARRI ifpvhdalghtigfsarkflensqggkt^pet 
PIFKJCSRILFGLNFSRJWIAK£KKVILVEGQADCL<3MIDSGFNCTVAACCT 
LSKXGVIJCVFLLFDSDF^NKAAI^vGDLCC/rAQMSVFVCKLPC^HD 
I ALLEQ SQDYLT FL I S EKM S SY PKFG PREKALLVEEA I RQ I KHWGS P I LVYEHLKQLAS L 
MMVPEDMVLSLANPQVTAEPQNI P IKQKVPK I H PH IVMETDI LRCMLFCGSNTK I LYTAQ 
FTFVPEDFKHPECF^FAFMISYYEKYRXWPFDEACQVLSDSQILQLXTKRRI>JTE^ 

tifvqslokmadrrwrecckplslnqniqdkxleiledyvqlrkdrtiitlldpeselip 

CPn_0943 1077972 1078238 

CT794.1 hypothetical protein 

ffmksfkfllpflsvilgcgnllssprsraisvtesigmsavktlvlsekaheflegigy 
gvgassilriwqtqqwleiesllaqnevn 

CPn_0944 1078503 1078997 

No robust homolog present in Genebank/EMBL as of 11/7/98

I K IMMHRYFI PLLALL I FS pslvraelqpsenrkggwptqlsc aegsqlfckfeaaynna 

IEEGKPGILWFSERPTPEFADLTNGSFSLSTPIAKGFTrVWl^PGLISPI^FFHKMDPV 

ilyvkjsflemfpeveavsgprlcyilideqggaqcqavlpletkn 

CPn_0945 1079001 1079660 

CT795 hypothetical protein

S I fknkilpsyfghnfdqlrrhymri alsllsllmi fpi fgeesrpgsedgnsntqeivg 
SQDTQVCLYHSYEQGLQASR I egkplvi wlcnsgddgqact iglsetceevlsvlsgs I 
fselanfwlvpsgvnpliypp I EDP ilaeivkfkelfkdesfptglsi iwgvtpegpg 
. diievspvsltveeeetlpseqttevestselqsedpaia 

CPn_0946 1082816 1079745 

glyQ-Glycyl tRNA Synthetase 

GECQKKKCYTLESFVSEHPLTLQSMIATrLRFWSEGGCVIHQGYDLEVGAGTFNPATFLR 
ALGPEPYKAAYVEPSRRPQDGRYGVHPNRLQNYHQLQVILKPVPENFLSLYTESLRAIGL 
DLRDHD IRF IHDDWENPTIGAWGLGWEVWLNGMEITQLTYFQA IGSKPLDTI SGEITYG I 
ER I AMY LQKKI S t YDVLWNDTLTYGQ ITQAS EKAWS EYNFDY ANTEMWFKH FEDFAEEAL 
RT LKNG LSVPAYDFV I KAS H AFN I LDARGT I SVT ERTRY I AR I RQLTRLVAD S YVEWRAS 
LNYPLL3LSSTSEPKETSESWPMISSTEDLLLEIGSEELPATFVPIGIQQLESLARQVL 
TDHNIVYEGLEVUSSPRRIjALLVKNVAPEWQKAFEKKGPMLTSLFSPDGDVSPQGQQFF 
ASCGVDrSHYQDLSRHASLArRTVNGSEYLFLLHPEIRLRTADILMQELPLLIQRMKFPK 
KMVWDNSGVEY AR P I RWLV ALYG EH I LP I TLGT 1 1 AS RNS FGHRQLDPRK 1 3 1 SS PQDYV 
ETLRQACVWSQKERRM 1 1 EQGLRAHSSDT I S A I PLPRL I EEATFLSEH PFVSCGQFSEQ 
FCALPKELLIAEMVNHQKYFPTHETSSGAISNFFtWCDNSPNDTIIEGNEKALTPRLTD 
GEFLFKQDLQTPLTTFTEKLKSVTYFEALGSLYDKVERLKAHQRVFSTFSSLAASEDLDI 
AIQYCKADLVSAvVNEFPELQGIMGE^LKHANLPTASAVAVGEHLRHITMGQKLSTIGT 
LLSLLDRLDNLLACFILGLKPTSSHDPYALRRQSLEVLTLV.^ASRLPIDLAS LLDRLADH 
FPSTIEEKVWDKSKTIHEILEFIWGRLKTFMGSLEFRKDElAAVLIDSATKNPtEILDTA 
EALQLLKEEHTEKL/\VlTTTHNRLKKIL3SLKLSMTnSPIEVLGDRESNFK0VLDAFPGF 
PKET^AHAFLEYFLi'LADL^NDIQDFLfrrVH IANDDGAIRMLR T5LLLTAMDKFSLCHWE 
.^VAV 

^Pn_0''47 10SMJ3 lOHIO'yt 

rx?sA Glycerol t !• Phosphatyiy Lr. r.in:;L*?t.iso 

G.'iRWJLPNY TTF:;RLF ITr [ KM [LYLKHKWK; rTPWI.I'Y^yLLAL.LAIfJKLTDA IIX'WA 
RKF:'JO'/TDUJKLLDPMAIW I VU f:; I YLTFTOf't'VTJI.r'LLL'/Fl F( j\Rt):JV(:"TI>RT\'CAF 
i'fIRWAARA:;CKLKArUx 'VSKI-I .TLLVM Il'll.'Il/ ILL: WILE! l-'A::vrVS t lAVYStA^ 
G I EY FWMNKNKL: H.»R AKTK t \ * i!Kf !l I K.'J K U 

':PiiJ>'"1H U)S'.4Hi MiH/1047 

'jL<jA -';lyt:i*i<:n :>vut h. ■ 

' ;n: 'mu t vovavki-ti > i vkvi -a-.u iijava: ; i .: :k i:i .akqni jv kvllpi iy pi . i : :kk: ;ovi .:*r 
psfyyeflgkuoa::a i : :y:;y w :r.:n;r 1 1 rt .n: :u i p.LF.'rpr. , ;v^:t'WNWUK::AFAAAAAA 
YLoEADPADI'/IlLHOWHVnLLAGLLKNPLNPVHSKIVS^BKYRGYCSTQLLAASQID
n^L;tlV0LrRDr0T3VLMKGALYC3DYtTTV3LTW^^»3DYELHDAILARNSVF 
•r l iw; iDEDVWNPKTDPALAVOYDA^tiSEPDV^FTraMBAVLYEKIXSISSDVFPLI 

TV [3R rVEEKGPEFMKEI I LHAMEHSYAF I L IGT3QNEVLLNEFRNLQDCLASSPNIRLI 
LDFNDPLAR LTYAAADMIC I PSHREACGLTQL I AMRYGTVPLVRKTGGLADTVI PGVNGF 
TFFE7TNNFNEFRAML3NAVTTYR0EPDVVOJLIESGM1JIASGLDAMAKHYVNLY0SLLS 




3KCF3MDC 



45 



«-Pn_no.i«i 



10R58R7 lORMfll 



i.-t.'. i ; -v M -v< :;-:rjn i mki.t/t: :i> ft' iff: :k i .kk ! r-^:- « a f -avvy: :a. v/.'i .an iTvnALVF 

KKFLSNLESGALSSTVF3LSYEGRI IKALVKDIQYQITTYDVIHLDFEELVEDRPIKLNI 
PIRC INAVDC IGVKLGGSLRQVIRAVRWCKPKDIVPFLELDVRSVGLSQTRKLSDIKIP 
AG I ET ITPLKEVAITVSRR 

CPn_0950 1086470 1087027

pth-Peptidyl tRNA Hydrolase

PSLEDNMAKLIVAIGNPRHGYAWTRHNAGFtJjADRLVEEWGPPFKPLSKCHAIifrLVES 

SSGPLVFIKPT7FWLSG1<AVVIJU<KYFNVALSHILVT-AD 

NGLKS ITASIX3SNEYWQLRFGVGRPLEEGVELSNFVLGKFSEEENLQLGS IFVEASTLFT 
EWCSKF 



CPn_0951 1087113 1087457



rs6-S6 Ribosomal Protein 
■EFIJ^KKEMQLYEGAYWSVTLSEEARRKAIJ3KVISGITNYGGEIHKIHDQGRKKLAYTI 
RGAREGYYYFIYFSVSrcAITELWKEYHU^LLRFMTLRADSVKEVLEFASLPE 

CPn_0952 1087469 1087723

rs18-S18 Ribosomal Protein
GEINMNKPVHNNEHRRKRFNKKCPFVSAGWOT 
RFQGVLSQAI KRARHLGLLPFVGED 

CPn_0953 1087727 1088248 

rl9-L9 Ribosomal Protein

FKGRRMKQQLLLLEDVDGLGRSGDL ITARPGYVRNYLI PKKKAVIAGAGTLRLQAKLKEQ 
RLIOAAADKADSERIAQALKDIVI^FQVRVDPnWNMYGSVriA^IIAEAAKKNIFLVRKN 
FPHAHYAIKNLGKKNIPLKLKEEVTATLLVEVTSDTiE^ 

CPn_0954 1088259 1088708 

ychB-Predicted Kinase

GRKVCYKDIMQYFSPAKI2JLFLKIWGKRFDNFHELTTLYQAIDFGDTLSLKNSMKDSLSS 

NVNELLSPSNLIWKSLEIFRRETQIHQPVSWH 

FQTHIPITTLQLWAREIGSDVPFFFLQEQH 

CPn_0955 1088612 1089175

(frame-shift with 0954)
RAlriPNPYSYNNIATLGSRNR^ 

GI^EKAYQSLLPQDYSTG>^ACFTGE2TOLEKSVFRIRTDLKNKKHMLE2RMWSPFESW 
L^SGATLFVCYLEELEQDSKVSSQIHSLIKQTQGIPVSW-YREPHWYSLKQSTYKNSP 
LECgQPQI 

CPn_0956 1089545 1090909

CKG5 hypothetical protein 

LVWESM ILPPYSYSLK IGAAVLFFC S I LHT FLT PWLYTLCQSYEHKKLVF PECWKRYARL 
SEiUERILSRVEIVFFLWAVPLFFWFLYTEGYRISMAYFNSRNYGFAVFIMVILILLESRP 
I WflAELVLSS I AKLGKTSPKSWWWTLMI APPLLSCLLKETGAMI IGATLLMRHFYVFS P 
S RRF AYATMGLLFSN I S IGGLTSYVS SRALFLI FPALKWEHS FFLSHFAWKAIVAI LIST 
TfYYFI FRKEFKKFPDI PSDKDPSVEKVPWWI ICVNI IFVGS I ILSRSTPLFMGALLLFY 
LGBQKFT I FYQDPINLSKVCWGLFYAGLVVFGDWEWVU^CGLSDFGYMWSYTLS 

i fiIdnalvnylvhnlsvatdcyhylwagcmaaggltlvsni PNIVGYL ILRSAFPSST I 

rofa&FLGALGPS I ISLGWWLLKNVPEFLYCFFR 

CPn_0957 1093812 1090963

ide/ptr-Insulinase family/Protease III
K I&tfRNCKMFWKLLCP I L I CTSLS ITSCEQQFKWPNQC PLQVSTPAAADQK I EKI ICSN 
GLPiLI I SDPNLPTSGAALLVKTGNNADPEEYPGMAHFTEHCVFLGNEKYPEVSGFPGFD 
SEtySVHNAFTYPNKTVFVFSVEHSAFSDALDQFVHLFINPKFRQEDLDREKYAVHQEFA/ 

ahpis dg rr vhr i qqlvapqg hpcar fgcgn astlt pvtt ekmaewfkl kys penmc a i a, 
yt^plskakkqfskifsqiprsknyerqepflpsgdtsslknlyinqaiqptsnleir 
h i yess h p i plccykaiaevlrnes knslvsllkneql itoldveffrs sltttgefy 1 3 
eltekgdkhysqvidstfqylryiqehgipnytleeistinalnycyssksplfdllckq 
ivsi^nedlstypyhslvypkyssedesallnlvsdpeqarfvlssknsehweeatc^hd 
pi fdmtyyvkaldgvqdygkvqslkpi alpkpnlfi pkevtlpgvhllkkqefpfa9als 
yqddkltlyhcedhyytapklssqirirspoisrsspqflvatelyclavndqllreyyp 
atqaglsftsalggdg i dlrvsgytttvpallns iltslpnlei syetflvykkqzlely 
ogalujcpvrsgldelasqvwetysnttklsaleklsfsefqafasnlfnsvhu 
gnlseookkdylemlqvftasrsshatkpfyyelqsqeiseihhdypltangmlLllqdk 
3 s ps iogkvcaemlfewlhh itfeelrtqqqlgymvgaryrefasrpfg fly irs days p 
eellaktslflnkvsaspekfgisqekfanirkayinkilepehsldmmnsa^fslafer 
pfvefstpdlkiaiaetltyeeflkycocflsnelgtctsvyirgtqkts 

CPn_0958 1094803 1093793

plsB-Glycerol-3-P Acyltransferase

r yrai ymqfsrylryafdnqylpeplyqkfsvfhqny idaatkkaaadqAevlclqwvkv 

riEDLWJPFIFPPYHKKIRAPIDLFRLSIDFFSLVIDDKNSRILNLHRKKEIEEYIARGD 

nwllanhqtecdpql>tyyalgkthpelmehwifv^ 

krhratppelreekllhnoksmqilktllneggkfit/apaggrdrldaegrlypsefsp 
e3 [evfrllakasnotthfypfalktydi lppppkiemaigeqraipapvffnfgaelf 
fdalc::keelihcdkhaqrtlraekvfstvknlyeel 

'.T-tiJJ'j'j'j H)')bilh I0 l .»4799 

'j.it'E-AxL.il FLLimonr. Protein 
AUCYGI f JTRKVMENEILLNrE3KEIRYAHLKNG0LFDLTIERK/VR0LKGNIYRGRVTNI 
t.HN rO.SAK IN I DERENHF I H I3D I LEN:'KKFE0MFDMDVDALPEEAGEAPLL33EEAP [ E 
RFLKLL:;rV[,V(jWKE[' [Cj:"KGARLTSN[:;iF J C';RYLVLLPN3 l P[lRGV3RKIEDPHMREQL 
KijL I k^FRMPOOM' IL Ii.'RTA3TTA3TEAL [NEAHDLLLTWtyT ILEK.FYSTEQPCLLYSET 
I j I LKKA7 f ft: [ DKMYKRLL r DDYATYQKCKHMLKKY3PDAlS IK I EYYRDS I PMFERFN [ E 
K El DKATHHK rWLS:JO*YLFFnKTEAMHT [DVNSGP.3TQUE3GVEETLVQINLEAAEE I A 
I^JI'Rf'l'NVflGLV [ IDF [DMKSRKNQRRVLERLKEHMKYdAaRCTIL3MSEFGLVEMTRQR 
NRF..':i J^yPLI-TLC.'PYt^i'tJNA 1 1 KTPE3W T E I ERDLKKV INHKEH3HLCLWHPEI A3YM 
K0KH[ J OfimiNLAK0[.K..\Kt,0rNTrJD:-;VHLNHYQFF3^ITGE3rDL 



CPn_0'.ihn ll 

^LV3YLsTpOK^^ t FRLKLPODTER I3Y3 i3PEYI REKGE 

WS^DCRPLIROEL^ESDCFEECSC^PERKNILKFLEDRKKHEGNSPFEYL 
CPn_09ht 1097 10S 10, 

rl32-L32 Ribosomal Protein



CPn_0962 1097301 1098275

plsX-FA/Phospholipid Synthesis Protein

IL3DFMEVQ IG I DLMGGDHS PLVWQVLVDVLKSQS ST I PF A FTLF ASEE I RKQ I QEEF I 
SDLPQEKFPKI I SAENFVAMEDSPLAAI RKK3 SSMALCLDYLQEDKLDAF I STGNTGALV 
TLARAKIPLFPAVSRPALLVCVPTMRGHAVILDVCANISVKPE 

DSKIPTIGLLNIGSEERKGTEA/tRQTFRMLRETFGEAFLGNIESGAVFDGAADIVVTDGF 
TCNIFLKTAEGWEFLORILG^KLEADIQRRLDYTFYPGSWCGLSKLVIKCHGKACGSS 
LFHG I LGS INLAQARLCKRII^SNL I 

CPn_0963 1098374 1103224

pmp_21-Putative Outer Membrane Protein

TPLRFKVAMVAKKTVRSYRSSFSHSVIVAILSAGIAFEAHSLHSSELDLGVFNKQFEEHS 
AHVEEAQTSVLKGSDP^PSQKESEKVLYTQVPLTQGSSGESLDLADANFLEHFOHLFEE. 
TWFGIDQKLVWSDLI^RNFS0PT0EPDTSNAVSEKISSt7rKENPJ<DLETEDPSKJ<SGLK 
EVSSDL PKS PETAVAjyi SEDLE I SEN I SARDPLQGLAFFYKNTSSQS I S EKDSSFQG 1 1 F 

sgsgansglgfenlkapksgaavysdrdi vfenlvkgls f iscesledgsaagvni wth 
cgdvtltdcatcldzealrlvkdfsrggavft arnh evqnnlagg i lswgnkgai wek 
nsaeksnggafac(kfvysnnentalwkenqalsggaissasdidiogncsa i efsgnqs 
lialgehigltof*gggalaaqgtltlrnnawc<vk^ 
vafkqntaaltg^alsandkviiannfgeilfeqnevrnhggaiycgcrsnpkleqkdsg 
eniniignsga^flknkasvlevmtqaedyagggalv<;h>a/i^snsgniofignig<;s 

TFWIGEYVGGGAILSTDRVT I S^SGDVVFKGNKGCCLAQKYVAPQETAPVESDASSTNK 
DEKSLNACSHffiDHYPPKTVEEEVPPSLLEEHPWSSTDIRGGGAILAQHIFITDNTCNLR 
FSGNLGGGEHS STVGDLAI VGGGALLSTNEVNVCSNQNWFS DNVTSNGCDSGGAI LAKK 

vdisa^sv^sngsgkfggavcalneswitdngsavsfsknrtr 
ticgnc^^/afkenfa/fgsewrsgggaiianssvniodnagdilwsnstgsyggaifv 

GSLVASE^NPRTLTITGNSGDILFAKNST^AASLSEKDSFGGGAIYTQ^KIVKNAGN 
VSFYGNRApSGAGVGIAIXX7IVCLEAF«3DIL^ 

v^dkni/fqdaityeentirglpdkdvsplsapslifnskpqddsaqhhegtirfsrgvs 
kipqi^iqegtlalsqnaelwij^lkqetgssivlsagsilrifdsqvtjssaplptenk 
ectlvJagvqinmssptpnkdkavdtpvladi i s itvdlssfvpeqdgtlplppei 1 1 PK 
ctkl$skaidij<iidptnvcyenh^ 

vslfs itpatyghtgwseskmedgrlwgwqptgykln pekogalvlnnlwshytdlra 
lkg^ifahhtia0rmexj3fsttjvwsglgvvxdcqnigefdgfkhhltgyal^ 

DFLIGGCFSQFFGKTESQSYKAKNDVKSYMGAAYAG I LAG PWL I KG AFVYGN INNDLTTD 
YQTLGI STGSWIGKGF IAGTSI DYRY I VNPRRF I SAIVSTWPFVEAEYVRI DLPE I SEQ 
^EVRTFQKTRFENVAIPFGFALEHAYSRGSPAEVNSVQLAYWr7mU<GPVSLITIJCDA 
jtYSWKSYGVDI PCKAWKARLSNNTEWNSYLSTYLAFNYEWREDLI AYDFNGGI RI IF 

CPn_0964 1104812 1103301

No robust homolog present in Genebank/EMBL as of 11/7/98 
QSILES I IKYFYLIHNSKMHMSNPISLFSPAELIAKYNLI PKTSPIYPRRTELI ILEENA 

s^qtrltnvaqvlhpsslfsmskkilnpcgcsggplxtwilj^iij^iitsvlfiillpvnl 
vaglrlft1plppkkivedlseptteetneviqpfifalqallfednklrsfkiveqsvg 
aplpnpflnrlvai s pqesqeamrk i pdlcsoijckvlkslgvltpewkhhilkyfeglkn 
hdsnpdkktfpiliklliealtgksslpktpstkekmqaalfiasscktckptwgevit 
aslnrlys i anegdnqlliwvqefkerelms iqdgddaeeyrfaaqqhgeryteaieqvl 
/rnesaaklowhvintmkffhgknlglvtehi^dtlgaltlrqttvot 
flnkylnsgnqlvnswksmqkadpetkalirefaldilyaslrlpc/tsahtewstllm 
dpetyepnkac iayllyvlki i el 

CPn_0965 1106769 1104925 

lpxB-Lipid A Disaccharide Synthase 

KGFSFSKVGL^IPSGLVYLLYPLGFLASLFFGSAFSIQWWLSKKRKEVYAPRSFWILSS 

IGATLMIVHGTIQSQFPVTVLHVINLIIYLRNLNITSSRPISFRAT^ 

FLYVNMEWMASPNIFHLPLPPAQLSWHLIGCLGLAIFSGRFLIQWFYIESNNTKDFPLLF 

WKIGLLGGLLALVYF I RIGDPINILCYGCGLFPS IANLRLFYKEQRSTPYLDTHCFLSAG 

EASGDILGGKLI0SIKSLYPNIRFWGVGGPAMRQEX3LQPILNMEEFQVSGFAEVLGSLFR 

LYRNYRKILKTILKHKPATLIFIDFPDFHLLLIKKLRKHGYRGKI IHYVCPS IWAWRPKR 

KRILEOHLDMLLLILPFEEGLFKNTSLETVYLGHPLVEEISDYKEOASWKEKFLNSDRPI 

VAAFPGSRRGDISRNLRIQVQAFLNSSLSOTHQFWSSSSAKYDEI I EDTLKAEGCQHSQ 

1 1 PMNFRYELMRSCDCALAKCGT I VLETALNQTPTIVMCRLRPFDTFLAKYI FKILLPAY 

SLPNIIMNSVIFPEFIGGKKDFHPEEIATALDLLNQHGSKEKQKEDCRKLCKV>1TTGQIA 

SEEFLKRIFDTLPAV 

CPn_0966 1108055 1106748 

pcnB_2-PolyA Polymerase 

LLITI IMVCENNILSGRGLELLKKKSNITLTPTIYSVSNHNIKLKOFSPHALSVIKTLRK 
AGYIAYTVGGCIRDLLLNTTPKDFDI3TSAKPEEIKAIFKNCILVGKRFRLAHIRFSK0I 
IEVSTFRSGSTDEDVLITKDNLWGTPEEDVLRRDFTINGLFYDPEHEEI IDYTGGVNDLR 
NRYLRTIGDPFTRFKQDFVRMLRLLKILSRSPFTVETQTQEALIACROELIKSSQARVFE 
ELIKMLNSGAAKNFFQLLIENHLLEILFPYMDKAFRLNRALEEQTATYLKALDDKILKKE 
AEYDRHOLMAIFLFPLVNF^A^^YKHOKHPYLSLTSVFDYIKNFLEQFFADSFTSCSKKNF 
I LTAL 1 LQMQYRLTPL I PTKKALFFNKKLLHHTRFLEALS LLE I RS I VY PKLDKVYVAW I 
RHHQTLKCKKDSHSOK 

CPn_0967 LL0H4 31 L lQ*9H r j 

mrsA/pgm-Phosphoglucomutase

FTAYKFAF ICACR^EK TRR IG I DF R R NMQCS V R K L FGT UN RG RAN [■' E PMTV-' ETTVLLG K 
AVARVLREGR:'GKHRV\'\V';KDTRL:;r,YMFFJ^At.rAGL.M:/M''j [ETI.VLjGP [ PTPGVAFITR 
AYRADAG IM r :.*A::HNPYRDNi; t K IFnLEGFK I rJDVLEQP I ETMV.SEADR JPLPEDHAVGK 
NKRVrDAMfJHWEKVKATFPKGPTI.Kf;LKIVLt>7AIK;A:;YKVAL':;VFEF.r.DAFA'ICYGCE 
PTG IN r NEJ H-'i IAL.P" I 1 0KAV [ Kl ffjAI I LC, I M.V IDG DP I r MVDEKt '1 1 1 VFH *VM I L3 [CA 
GDI J KKR:;ALP!INkVVA , riM1'NF^VI,KYrJ-:tUA;i///FTr:pVGDRIIVtJlAMhi:HEVn , UViE0 
SGMM [ FLDYNTTi IU : I V:'AI jijVt ,R IM r F.: :K::M[ ..^DLTAf' I VK::iX?P1 - f NVAVRRK I PLET 
[PLrERTIJ<DVOnAU:r;MR[I.f.HY::( ;TENii:K7MVRf;HKKHnVI>M,AKAl.ADVinAELG 
'I'GiJRE , . „ 

CPtl_l)'lMH I 10'iHH'i I I I \ !.' ] 

<1 Lin:! -tUncD^.iiiiiii.' l-'t \u-fy-.,- i' .\iiu n< >t. r -jar; t \- t'.v.:* ; 
DRMCG r FGY LGNQDCV3 I VL EG LAKL EYRGYD3 AC LAA^^^^k.F I RKTVGR VQ ELSNL F
OERE t FTA3VrCHTRWATHGVPTEINAHPHVDEGR3CA^^B I ENFKELRRELTAQGI 
3FA3DTD3EI [VQLFSLYYQESQDLVFSFCCTLAGLRCS^B^IHKDHPHTILCASQES 
PL T LCLCKEETFI ASDSRAFFKYTRHSQALASGEFA ivscgkepevynlelkkihkdvrq 
TTCSEDAiJDKSGYGYYMLKE IYDQPEVLEGL IOKHMDEEGH r LSEFL3DVP I KSFKEITI 
VACG33 YHAC Y LAKY I IESLVSTPVHI EVASEFRYR R PY IGKDTLG I L 1 3QSG ETADTLA 
ALKELRRRNIAYLLGICNVPESAIALGVDHCLFLEAGVE IGVATTKAFTSQLLLLVFLGL 
KLANV>ICALTHAEXT3Fnor;LOSLPDLCOKLLArJE3LHSWAOPYSYEDKFLFLGRRLMYP 
' 77MF.AA 1 .Kl K I AY i KAI JAY E ' J' Wti 1 1' It- [ A!.. \ .'Y' Y"\ V ', AP : JCL) 1 77 f-JKM L> INMMEVK 

■M'iiAiiv i a i An:. :i'»Lr i aav.h.v. ?fvpv /../...■■vi/rr :-/^:;vmayama; .ak« ;me ;DC 

PRNLAKSVTVE 

CPn_0969 1111803 1112999 

tyrP_1-Tyrosine Transport_1

VYVMSN KVLGG S LL I AGSA IGAGVLA VPVLT AKGG FF PATFL Y IVSWLFSMASGLC LLEV 
^m^ESKNPV^mSMA£SrLGHVGKISICLVYLFLFYSIXIAYFCECGNILCRVFNCON 
LG I SWI RHLGPLGFAI LMGP I IMAGTKVI DYCNRFFMFGLTVAFG I FCALGFLK IQPSFL 
VRSSWLTT INAF PVFFLAFGFQSI I PTLYYYMDKXVGDVKKAILIGTLI PLVLYVLWEW 
VLGAVS L P I L5QAX IGGYT AVEALKQAHR SWAFYI AGELFGFFALVSS FVGVALGVMDFL 
ADGLKWNKKSHPFSIFFLTFIIPLAWAVCYPErVLTCLKYAGGFGAAVI IGVFPTLIVWK 
GR YGKQHHREKQLVPGGKFALFLMFLL IVIMWS IYHEL 

CPn_0970 1113452 1114648 

tyrP_2-Tyrosine Transport_2

VYVM SN KVLGG SLLIAGSAI GAGVLA VPVLT AKGG FF PAT FLY IVSWLFSMASGLC LLEV 
MTWMKESKNPVNMLSMAES I LGHVGK I S ICLVYLFLFYSLL I AYFCEGGNILCRVFNCQN 
I^ISWIRHLGPLGFAIL^K3PII^^AGTKVIDYC^mFmFGLTVAFGIFCALGFLKIQPSFL 
VR SSWLTT I NAFPVFFLAFGFQS 1 1 PTLYYYMDKKVGDVKKAILIGTLI PLVLYVLWEVV 
VLGAVSLPILSQAKIGGYTAVEAIJCQAHRSWAFYIAGELFTjFFALVSSFVG^ 
ADGLKWNKKSHPFSIFFLTFI I PLAWAVCYPE IVLTCLKYAGGFGAAVI IGVFPTLIVWK 
GRYGKQHHREKQLVPGGKFALFLMFLLIVINVVS IYHEL 

CPn_0971 1114693 1115415 

yccA-Transport Permease 

EGSMGLYDRDYIQDSRVCGTFASRVYGWMTAGLrVTSCVA 

FATLGVSFF INSK ICTLSVSAVGGLFIXYSTLEGKFFGTU-PVYAAQYGGGVrWAAFGSA 
ALVFGLAAVYGAFTKSDLTKISKIhfTFALIGLIXVTLVTAWSMFVSMPLIYlXICYLGL 
VI FVGLTAADAQAI RR I SST IGDNNTLSYKLSLMFALWWCNNTCMVFWYLLQ IFSSSGNR 
D 

CPn_0972 1116377 1115430

ftsY-Cell Division Protein FtsY
RCI^SLLFPSYLVSFLJJ^LTLLLAMFKFFKNKI^SLFKKNISU3 
GTEirfEELCARLRJCTKKADASTIKDLITVLLRESLEGLPSQ 

SG KTTT AAKLAHYYKE RS E SVML VAT DT F RAAGMDQ ARL W AN ELGCG FV SGQ PGG D AAA I 
AFfiSlQSAIARGYSRVI IETTSGRUfVHGNUiKEI^KWSVCGKALEGAPHEIFWrVDS^ 

gntjja 1 eqvrvfhdwplsgli ftkvdgsakggtlfq iakrlki ptkf igygeslkdlnef 
dlocflnklfpeveki 

CPn_0973 1116346 1117527

"sucC-Succinyl-CoA Synthetase, Beta"

EGKSKELFMHLHEYQAKDLLASYDVP I PPYWWSSEEEGELLITKSGLDSAV 
GRGKtfGGVIVAKSSAG IU5AVAKI^MHFTSNOTADGFLPVEKVLI SPLVAIQREYYVAV\ 1 
IMqR^RCPVU^SKAGGMDIEEVAHSSPEQILTLPLTSYGHIYSYQLRQATKFMEWEGE\ 
VMHQGVQL I KKLAKCFYENDVS LLE I NPL VLTLEG ELLVLDS K IT I DDN ALYRH PNLEVL 
YDPSQENVRDVLAKQ IGLS Y I ALSGN IGC IVNGAGLAMSTLDILKLHGGNAANFLDVGGG / 
AS0KQIOEAVSLVLSDESVXVLFINIFGGIMDCSWASGLVA\^^RDQWPTVIRLTC^ 
NVBLfiKE IVQQSG I PCQFVS SMEEGARRAVELSM 

CPn_0974 1117523 1118422

"sucD-Succinyl-CoA Synthetase, Alpha"

VCR&RRYMFHSLSKNTPI ITQG ITGKAGSFHTECCLAYGTNFVGGVTPGKGGTLWL6LPV 
YDSyLEAKQ ATGCRATMI FVPPPYAAEAI LEAEEAG I EL I VC ITEG I PVRDMLEVARVMD 
. NSTSOLIGPNCPGI IKPGECKIGIMPGYIHLPGNIGWSRSGTLTYEAVWQLTQEfklGQS 
ICVGJGGDPLTCTSFIDVUJALEEDPYTELII^IIGEIGGSAEEEAAAWIOA^ 
I AG\fTAPKGKRMGHAGAI I SGNSGDAKSK IQVLRESGVTWES PAH IGKTVDAVLRAKEL 

CPn_0975 1119038 1119637

No robust homolog present in Genebank/EMBL as of 11/7/98
G I EEQV ALS IAIKILKII LAL I L F P L VLLAWV I R YQ LHANFHCSW P F POT SVNQ A YKC S 
EAKIEEMLDLLDLETLEWSSRCLRQDffTFANRLEEELIQEUlVSETEELuSLGGKRNLVR 
LLLTH FFNP PKRSRVESVGHEWFPVFDRLKREEE I IGDG P I TRSNEE/WALLDHGTARG 
IHKTLWFSIFFKYLTQIELF 

CPn_0976 1120079 1121185 

No robust homolog present in Genebank/EMBL as of 11/7/98
I LMLVYCFDPSVPTSPEHRLMAALDRWFFLCGHRAR I LTLEGNHffRAFQENMS I STVEK I 
LKLrSYLLIPrVLIALLIRCFLHSRFKCNWKCDSLSDARVPHDyQPFNDFQLFNNQERLN 
IWKNRRYVSGIDVLMVPVDYLRSQFPGFKEIPEAIRCEhTYVSriGOFSEESKTSYLRAMLT 
D I VGY I LSLDETYWTNVI LKIRAMC ITFESFPCKEADPNYS SRVTHHYFDESWKALARHV 
LG ECNMVNR LDEALI RTEK PGKEGEC ITKQFLKDYCKKHLEVMSCPDF I ESLVDEK IREF 
RCPS I LNSAVCDVI DRKCQEHLLKAI INEANRRLFGMKNS£FTMRGNQVLFYT I FS PPKL 
PPAASSVYF 

CPn_0977 1121329 U23402 

No robust homolog present in Genebank/EMBL as of 11/7/98
LY I NQ FAN I LKSS FLMEVYSFS PSVP.TS FOHRVM.^ALDNWFFLGGRRLK WS LDSCNSG0 
ACEE'A/PISTTEKVLKILSYLLIPIVriALLIRYLZHSNFTAKVSOKPWLKTLQLGIDIK 
:>TILPGSMVNTMD!3ATLFKAIRLBGKRVDVEYHRmsSDKWFYIPAQKLPDDLRLTHWL 
PEKETRKTEYVRHMLJVHVMGYLTSC^jKERLOOVWDSRS3T5LGAEKVL0YRFIDHP0S0 
GEFOKLLNEN ITTKGSEDKEWQSDLFDMAFQGWWPQFISV rQSPTFSEELVHEMSQKLD 
LEX:[YPEDDEFE0KFLNTLLKAVLHHGFECI3]?Ai?Mr;VrFLICPDSLAL0IPFLRN0K 

t:t J n_fj'*7H 112265* li://V< 

No robust homolog present in Genebank/EMBL as of 11/7/98
KYI-TMEW:jl'MPAVRTr;FU!!RV^LDA^FUk'flKLKW::LD:;CN3^AY0ELV3I3TT 
KKVI.KLL:;YI.LVt>[Vr tALLrRCLLH2N/R[DVEKERWI.K[RELCrDIESCKLP.*;3YVN0 
V: ! S F [ W F F.K DK : J K R P R T DV DYHT LH 3 KmfW F P t V FQ K [ ['KT3RF3YWF3QKETRKRDYV 
HNMI.UIIVtf;YLT:;B:f;LM 1 UYI3KT:;703AT.';LrrER7LOYi:LTDNOELOGEVORLLNEE 
:;ATK3f:fJDKKVLrj:nVflDr rCQCWWn«FLEVI0:TAFrFXLVEEV3OKLNLDFLCLEKAN 

■n,i>jKf.RN::r.(,i'Awnii(::;w:vDrfCKvr,AGLi [YTEAroLOtPF.'JRS 
CPn_0979 LlH^T 1125/143

htrA-DO Serine Protease 
GIDMITKQLRSWLA\/LVC33LLALPL3G0X , ^KKESRVSELP0DVLLKEISGGFSKVATK 
ATPAVWIE3FPKS0AVTHP3PGRRGPYSNPFDYFNDEFFNRFFGLPS0REKPQSKEAVR 
CTGFLVSPDGYIVTNNHWEDTGKIHV^HDCQKYPATVICLD^ 

LS FGN5DHLKVCDWA I A IGN P FGLQAt/tVGV I SAKGRNQLH I AOF EOF IQTDAA I NPGN 
3GGPLLNTDGQV IGVNTA rV3G3CGY/c TCFA I P3LMANR t t DQL I RDGOVTRCFLGVTL 

'.'!•; .' . aet AA' .'YK ; .efvy- ;a:. tt: v// :. :;'al-*a- :; v : ; ay:.» :;:;-:v:;: ;; .. :mfrnav.; 
i.Mrrr -tr ivr.Fv/RKJirvrF.f rrr/ \ \vy?.:,- :m.'ai.; . :- - ;;'-/"? rrrA :a; f. 

TKG I L I I S VEPGSVAA3SG I APGtyL I LAVNRQKVS3 I EDLNRTLKDSNNEN I LLMVSQGD 
VIRFIALKPEE 

CPn_0980 1126^88 1125504 

similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein

fvmlnh akkhakpyvl i ff/tkdklsycd 1 1 fnncsg k pmnldskh fd i nsanfleefak 
fi s fps isadsdhlqdcefijcahflvdhvnk i fdvelwet pgh p p 1 1 yas yks edpls ptl 
mlynhyxmjpaqlsdgwmdpfiu^ee^^yargasdnkgcktfytlkai^hyyesc^ 
pl^iiwliegeeesgslAlftwlekkkealradyllivdggflsekhpyvsigargivsm 
ki sleegnkdmhsgvlcgi ayntnralseilsslhhpdns iai egfyddlalpsdsdrpd 
lpksdtlreceenlg/rpqgyeasys peesalrptve ing 1 3ggytg pg fktvi pyrata 
ylscrlvpnqdpdkjfthqv ihhlkqqvpss lkfsye i lpggs rgwrss anlp ivkvlqe i 
ysdlyneeclrlvmpati pigpllgeaaqts p i icgtsylsdd i haaeehfsmoqlkkgf 
lsicqlldklpklke 

CPn_0981 1127019 1129952

Zinc Metalloprotease (insulinase family)

VTESMKAGOTYRNFI I KSCKDLPE I ESKLLEAEHKPTGAS I MMI VNNDEENVFNICFRTC 

potsm^ah^hmvix:gsenyfvrdpffsmtrrslntf INAFTGPDFTCYPAASQ I ped 

FYNLLSVY/DAVFHPLLTKQSFI£EAWRYEFNSEW 

ALNAAIF/sVTYGVNSGGEPREIVTLSHEDVRAFHQSQYSIhmCLFYFTGNIKPSRHLDF 
LEEKLL9QATKLEKQAVSVPLQKRFKE PVRNI LTYPVDHQEEDKVLFG I SWLTC S I LEQQ 
EIILMOTDASPLKSRLLKSGFCKQTEMSIENDIREIPMTLVCKGCSPAGAQK 
LEALtf ASLEEIIRfCISENrATBGAVHQLELSRKEITGYSLPYGI^LFFRSGLLKQHGGS 
A£IX^IHSLFSEIJWSLJ<NSDYIJUaiRKYFI^NPHFARVII^PDTELVAKC^ 
LLSySEKXTDENK^IC^NVRELTESQEQKEDLNGILPNIALDKVTTSG 
CC8VIiiHECFTNDIWIDWII>IPPLSGEELPWIJ^ 

IGGVDVSYDFSPHANKNSFl^PSVSIRGKALSSK5EKLCGIVSDMLTSVDFTDIPRIRE 
MQHNEALTNSVRNS PMS YAVSMACSGNS ITGAMS YLTTGL PYVKK I RELT KNFDQNI D 
AWI LQRLYTKCFSGKRQ I VI SGSAHNYQQLKDNKFYGLLDYLI VI PE PWENPS I NLYV 
TSRGLH I PARAAFNALAFP IGD IAYDH PDAAALTVAAE I LDNWLHTK I REQGGAYGSGA 
' AANI^RGSFYCYSYRDPEIATTYKTFUC3VSEIASGNFTKEDIYEGALGVVCC 
GS RASVAFYRLKSGR I PVLRQAFRRSVLEVTKEH I CMVMDKYLESTVQETTL I S FAG EEM 
LRNNVLTLDKDFPIVPAI 

CPn_0982 1131215 1129962 

yigN family 

KKELASVMNLPVSLACLIXSGCVFFLGVFVSSSLYARKKRA^ 

NLSRHQEQL I ED FSNRLALSS HKL I KI»1KEEAQNYFGDTS KS FQSI LS P ICTTLTTFKQS 
\ LETFETKHAEDRGFlIJ<EQISOLLAVEKKLEHETHVLTDIU<HrcSRGRWGEIOLERIL^ 
| AGMLKYCDYDSQTTSAQGAFRADI I IRLPQDRCLI IDAKAPISDSYFSVEEIDKGDLVDK 
I IKEHIKTLKSKSYWEKFHQSPEYVILFLPGESLFNDAIRLAPELME IGASSNVILSSPLT 
f LIjALLKTIAYMWQENIjQKQIQEVSLLCKELHRJU>QVVFTHFOKIGK^^ 

SSFQYRVLPTLRKFEGLETSSSHQI EEPTPIESLATSFPHTCDIDTNLAVI ESLEKQD 

CPn_0983 1132045 1131206 

pssA-Glycerol-Serine Phosphatidyltransferase

KNPLCYEQKKLWQIDMAGLDLEARGKRRVVTPNAITAFGLCCGLFIIFKSVLRTSSSVEL 
FHRLOGLSLLLISAMIADFSDGAIARIMKAESAFGAQFDSLSDAVTFGIAPPLIAIKSLD 
GI YVGNFFSSLLLITS I IYSLCGVLRLVRYNLFSQKTVDVSKPYCF IGLP I PAAAAS IVS 
LALFLASDFFPDLPAQLRVGLLSFALLFIGGLWISPWKFPGVKHFRFNVSSFLLVVTIGL 
AACLFFSGLVDHFVEVFFLVSWLYTLVGFPIFSI IYRKKS 

CPn_0984 1132370 1135510 

"nrdA-Ribonucleoside Reductase, Large Chain" 

GKVMVEVEEKHYTIVKRWGMFVPFNQDRIFOALEAAFRDTRSLETSSPLPKDLEESIAQI 
THKWKEVLAK I SEGQWTVER IQDLVESQLY I SGLQDVARDY I VYRDQRKAERGNSSS I 
IAIIRRDGGSAKFNPMKISAALEK^RATLQINGMTPPATLSEINDLTLRIVEDVLSLHG 
EEAINLEE IQD I\ZEKQLMVAGYYDVAKhTY I LYREARAR 

OKEDGTTYLLRKTDLEKRFSWACKRFPKTTDSQLLADMAFMNLYSG IKEDEVTTAC IMAA 
RANI EREPDYAF I AAELLTSSLYEETLGCSSQDPNLSE I HKKHFKEY I LNCEEYRLNPQL 
KD YDLDALS EVLDLS RDC£FSYMGVQNLYDR YFNLH EGRRLETAQ I FWMR VSMG LALNEG 
EQKNFWAITFYNLLSTFRYTPATPTLFNSGMRHSQLSSCYLSTVKDDLSHIYKVISDNAL 
LS KWAGG IG NDWT DVR ATG AV I KGTNG KSQGV I PF I KVANDTA I A VNCXjG KR KG AMCVY L 
ENWHLDYEDFLELRKNTGDERRRTHDINTASWIPDLFFKP.LEKKCMWTLFSPDDVPGLHE 
AYGLEFEKLYEEYERKVESCEIRLYKKVEAEVLWRKMLSMLYETGHPWITFKDPSNIRSN 
QDHVGWRCSNLCTEILLNCSE3ETAVCNLG3INLVEHIRNDKLDEEKLKETISIAIRIL 
DNVIDLNFYPTPEAK0ANLTHRAVGLGVMGF0DVLYELNI3YAS0EAVEFSDECSEI IAY 
YA ILA33LLAKERCTYASYSG3KWDRGYLPLDTI ELLKETRGEHNVLVDT3SKKDWTPVR 
DT rOKYCMRNSOVMAI APT AT I SN I IGVTQS I EPMYKHLr/KSNLSG EFT I PNTYLI KKL 
KELGLWDAEMLDDLKYFDG3LLEIERI PNHLKKLFLTAFEIEPEWI IECTSRRQKWIDMG 
VSLNLYLAEPDGKKLSNMYLTAWKKGLKTTYYLR3QAAT3VEKSFIDtNKRGI0PRWMKW 
KSAST3 T WE R KTT PVC SMEEGC ESCQ 

CPn_0985 1135432 1136571

"nrdB-Ribonucleoside Reductase, Small Chain"

tSVHKYCGRKKNNPRLFNGRRLR ILS ITEKRGAKMEADI LI/IKLKRVEV^KKGLVNCNQV 
DVNOLVP [ K YKWAWEH Y LNGCANNWL PTEV PMARD I ELWKf:&EL:;EDERRV I LLNLGFFS 
TAE:;LVrjNNTVLA[FKIirTNPEAROYLLI<OAFEEAVIITHTFLY[i':E:;u;LDEGEVFNAYN 
FRA.'J I R AKDDFCMTLTVDVI ,DPMF3V0r;r; R( II i 'tjV [ KNL.'A JYY [ [ MRU FF Y3f JFVM I L3 
FMRONKfW I ICEQ YfjY [ I ,R DFT [ ! ) LNF f ; I Dl . 1 N> 1 1 KEFJIWJWn'Kl <OFE I VAL I EKAVE 

[.EiEYAKni;LPRf;ru.aji::::MFrDYVRinA[iiikiJ-:H[t;irrMYii::iwpKivw:;FTMDLNK 

FKN F F FT R VT E YfjT Ai :NI.::w 

'■fn_U'iH*. 1 I -.(.■/ I [I i7 

yiMll pr.ni let <:<l r UNA M. r r I iy I , c ■ 

(•MMJ- , MKr^Di,::ppi-M,WKKi<i^vMvi':vr/('vri<iiY!- , KiiuNF:; , r:;YtiuKF 

: v 'M ii^/vaoaukui vvi.w I avkok i'i jcvrk I w: :km t ni to t'Jt-n A< I Vi "i ;TAirn- l-'UYYVP 

u.vni[^r'RMFT[ i YY[KMTm'Y';n:;wi-i:ni.wi."! , K' ^kifytefikkaci 
CPn_0987 1117491 U3HU5

ytqB-like predicted rRNA methylase




LENC r FA rGFFM FAYRTLLTJ INWQV3H E I FKTTWPCDTVT DATCCNGNDS LFLARLLQ 
GEGRLVAT^DrOKEALSNALLLFETHLSEQERSVIEWKEOSHEHILEKDVKLIHYNLGYLP 
KCNKE ITTLARTTEI3LEYALN IVRPDGL ITWCYPGHPEGEKETHSVE3LAQRLH PKEW 
CV3.1FYVANRCRAPRLFIF0R0CSE5SVDKC 

::ii;?>-. I'll' :,' y ! ■< , i ( f ' 1 V •*/ I ' J !• I' Ill 1 n. !'■ ' I' !" T . f. • 

KFFINLINLDQGILKMKEAAPMHFPFPVRRSVWLNRYSTFRIGGPANYFKAIHTIEEARE 
VIRFLHSINYPFLI IGKGSNCLFDDRGFDGFVLYNAIYGKQFLEDARIKAYSGLSFAALG 
KATAYNGYSGLEFAAGIPGSVGGAIFHNAGTNESDISSVVRNVETINSEGELCSYSVEEL 
ELS YRSSRFHRQQEF ILSATFQLSKKQVS ADHSKS I LOHRLMTOPYTQPSAGC I FRNPEG 
TSAGKLI DAAGLKGLAIGGAQ I SPLHANF I INTGKATS DEVKQL I A I IQSTLKTQC IDLE 
HEIRIIPYQPKIHSPVSEK 

CPn_0989 1139552 1139016 

CT832 hypothetical protein 

LRT S LA VKCVLLT I FWLL VMAT LSPEKFSGSPISISKEF PQQKMRE 1 1 LQML YALDMAP S 
AEDSLVPLLMSQTAVSQKHVLVALNQTKS ILEKSQELDLI IGNALKNKSFDSLDLVEKNV 
LRLTLFEHFYSPPINKAILIAEAIRLVKKFSYSEACPFIQAILNDIFTDSSLNENSLSI . 

CPn_0990 1139880 1140440 

infC-Initiation Factor 3 

SVALNFK INRQ I RAPKVRL IGSAGEQLG I LAI KDALDLAREAGLDLVEVASNSEPPVCK I 
MDYGKY RYG LTKKEKDSKKAQHQVR I KEVKLK PN I DENDFSTKLKQ ARTFVEKGNKVK I T 
CMFRGREUVYPEHGFKWQKMSC^LEDIGFVEAEPKLAGRSLICWAP<7rVKTKKKQEKS 
HAQDENQ 

CPn_0991 1140394 1140612 

rl35-L35 Ribosomal Protein 
KQRK^KSLMPKMKTNKSVSARFKLTASGQL 
KGQ VGMY KRMMLV 

CPn_0992 1140622 1140996 

rl20-L20 Ribosomal Protein 

GKLVMVRATGSVASRRRRKRILKQAKGFWGDRKGH I RQS RS SVMRAMAFNYMHRKDRKGD 
FRSLWIARIJvIVASRIHSLSYSRLI^UCCANISLNRKMLSEIAIHNPEGFAEIANQAKKA 
LEATV 

CPn_0993 1140975 1142030

"pheS-Phenylalanyl tRNA Synthetase, Alpha"

KSE^HSIjGIRISMEMKEEIEAVKQQFHSELDQVNSSQALADLKVRYLGKKGIFRSFSEK 
LKfjG^DKAKLGSLINDFKTYVEDLLQEKSLVIJJ^ 

HI I3CS I LDDWD I FVHLGFCVREAPN I ES EANNFTLLNFT EDH P ARQMH DT FYLNATTVL 
RTHTSNVQ ARELKKQQ P P I KWAPGLCFRNED I S ARS HVLFHQVEAFYVDHNVTFS DLT A 
ILSAi^HSFFQRKTELRFRHSYFPFVEPGIE^WSCECCGKGCAIXZKHTGV^EVAGAGMI 
HP^RIJGNVDPEIYSGYAVGMGIERI^MUCYGVSDIRLFSENDLRFLQQFS 

CPn_0994 1142371 1144440

hypothetical protein

lfwehrggrmkrspjwfeqalenleklkeisi^ts^syi^par™orj<otgssvme^ 

EAUflMVENYLLE I SCVSKSHADKALKESDFLIAGVQNVFSFLENQEDLYKSLLDEYSHVT 

kaybevkknlkevptydlstdeeteehkepecfu^vevkrdrs^ 
ntoftlvqiiykqnklhe^wegdpltktl^wnseev^ 

ld i eawkvhnavmalffsryeatmvfks pkkhniwyftjdfllftjieawkdujn^'idsq 
erkqtkllasalslgifesklvfeeasrylyfnigtklenangkkplspgqyl/dayeel 
hruskypngplfkamdrvlehesrpydpmilgilpslegtlklhgks idi ir5pspvtq 
sstlyancneeflgflnakahrsevtlvl^iq^ispjcerarsrvieeale/eehapyvh 
afs£pepeellonlesihgdietfadffsiijqeefhkpli*assffltkeij&fvgsflke 
kl j alkd iffakkk i lfrndkllllhlls yl i vfkl i ertnpns ivwskdgldyvsvf i 
agj?^fsreafvtoehslj(llltnvlsptlvardrlvfvshiei^^ 
lk$ffkddiegweftgylheltevshkhnl / 

CPn_0995 1145515 1144415

CT^IJ hypothetical protein / 
RMLJWKRHLLTRFWFALTSLLVLALI FYAS I HHSLHTLKGASTAASGASVKLS ILYYLAQ 
tSfeteFL^PQLVAVATTSTLFAMQNKREIILLQASGLSLKSLHHPLLLSGAVIMMVLYA 
NFQWLH PICEKISI TKENMDRGTTDKEQGK I PALYLKDQTVLLYSS I EPKTLTLNNVFWI 
KDPKTIYTMEKLAFTTLSLPIGL1WTQFFANDSENLELKEF/DMKEFPEIEFNFYENPFS 
KLFSAGM<NRLSEFFKAIPWNATGLGLSTQVPQRILSLU^FYYVLISPLACMMIILSA 
YLC LRF SRT PTVTLAYL I PLGTVN I FFVFLKAG IVLASS M, PTLPVHAFPL I VLFLLTN 
YAYAKLQ T 

CPn_0996 1146592 1145519

CT839 hypothetical protein

AM P I LWKVL IFRYLKTAAFCTLSLIC I S 1 1 SSLQe/vAY I AKDVPYDTVLRLMAYO I PYL 
LPF rLPGSCFVSAFSLFRKLSDNNHMTFLRASGAflfQS I IMFPVLMVSGAICCLNFYTCSE 
LAS ICRYQTCKE I ANMAMTSPALLLOTLQKKENbfe I FI AVDHCAKSKFDNVI VALKCNNE 

rSHVGI iksi ipdttkdtvkakdwfisklpdsItessspssqrfyietldellipkits 

TLFAGKSYLKTRTDYLPWKOLVKQSLKHSHLBETLRRVAIGFLCITLTYAGMILGIHKPR 
FRKSIALYFIFPILDLILLIVGKKTKNLPLA^MLFVFPQLVSWWFAARAYRESRGYA 

CPn_0997 1146699 ll/lf>64 

mesJ-PP-loop superfamily ATPase

AYKMVL3SDLLRDDK0LDLFFA5LDVKWRYLLALSGC3DSLFLFYLLKERGVSFTAVHID 
Hr,WRST3A0EAKELEEI>CAREGVPFV[jrTLTAEECGDKDLENQARKKRYAFLYESYRQLD 
AtJG[FL/\HHAND0AETVLKRLLESAH£TNLKAMAER3 - /VEDVLLLRPLLHIPKS3LKEAL 
DARCirJYLODPSNEDERYLRARMRKbfLFPWLEEVFGKIirTFPLLTLCEErfAEL^EYLEKO 
AyPFF3AATI!OD^:a , .ELFX:PDCLr<X>AFLCKWVMKKFF^fNACrAV:;RHFL0MV*^'DHL3R3 
:;OATLRMRNK [V [ I KPCVWID / 

t r :;H -A'f'I'Mh:pi!n(it>tlt z i/c ptotiMue 

Li:;[//yKFM:;KUKKMKrEPKj^FnVFFFLLF^WFrr/ 1 /A^NFf^t;KKARVCF:;HOrEII 
1 .VNU<L I VPEP^ItK lALND6/LV::FCit.JRFRDVQTOF.COr<RYHY[ J EL I LXX'IflRLDLDLOETG 

k:; [ .rr [/ ;kf.vtn: ; 1 1 .wf::/i ru wp r peqgya i syp.';l"/:;q :vi :r f.plwtcipatpql inl 

ll:;UM-;RYlM'L::R::PK J \[.HTYf::;DLYELIGKYL:;rVI/;r';:JCTLKRELKDLYOOVEVSL , rtJ 
^r^^/^FAAYTr^f;OVI,::J^NR[SS^LWSECy3ERF:;OLI^SVR[ > YR^^I■:WNKYMKLVF^RDLM 
MAOI.KKlJ^;RJ/:(yr\^TNNOEL^:>nSLEKODPEVF^IIWFAi^\KL:t-WTAFKFMI[::L;'FKA 



POQPRNLVLEKTFKSOEPS^^^FTFLPIILVLLFV^LVF^ROMRGMSGSAMSFCKS 
PARMLLKGONKVTFADVAG ^HEeL I E I VOFLKNfNK FTS LGGR r PKGVLL IC P PCTG 
KT L I AK AVSCEADR PFF3I ACoDFVEMFVCV3A3 R/RDMFEQAK RNAPCI IFIDEI DAVC 
RHRGAGIGGGHDEREGTLNGLLVEMDGFGTNEGVffLMAAT^PDV^ 
VMNLPDIKGRFEILff/HAKRIKLDPTVDLMAVA^TrcASGADLENLLNEAAIXAARKDR 
T A VTAVDV A EAR DKVL YGK ER R S LEMDA E ER ICTJT A Y H ESCH A WC LCVQHC D PVDKVT 1 1 
PRGLSLCATHFLPEKNKLSYWKKELYDOLAVL«XRAAEEIFLCDISSGAQODISQATKL 
\^3WCEV^MSP0LG^/TTDERS0GLTGYGOTHEKSYSEETAKTIDTELRMLLDAAY0RA 

.r : : -KAi-: : elmt-wi : .; -::Krv»E imp.! ir.vrr K-rAi'-Lriiiv ;mlf kkjjcdl 
i^t-'m-kep-:':.;^:.:"::^: / 

CPn_0999 1152859 1150766

pnp-Polyribonucleotide Nucleotidyltransferase

QETFWJFQTI3INLTEGKILVFETG/lARQANGAVLVRSGETCVFASACAV r DLDDK\TDFL 

PLRWYQESCFSSTGCTLGGFIKREOTPSEKEILVSRLIDRSLRPSFPYRLMQDVQVLSYV 

WSYDGQVLPDPLAICAASAALA 1 3DI PQSNIVAGVR IGC IDNQWVINPTKTELASSTLDL 

VLAGTENAILMIEGHCDFFTEEQ'v'LDAIEFGHKH IlTICKRLOLVs«2EEVGKSKNLSAVYP 

L PAEVLTAVKEC AQDKFTELFNT KDKKVH AAT AH E I SEN I LE KLQREDDDLFS SFN I KAA 

CKTLKSDTMRALIRBREIRA^RSLTTVRPITIETSYLPRTHGSCLFTRGETQTLAVCTL 

GSEAMAQRYEDLNGEGLSKRYLQYFFPPFSVGEVGRIGSPGRREIGHGKLAEKALSHALP 

DS ATFPYTI RI ESNIT ESNGSS SMASVCGGCLALMDAGVP I S S P I AG I AMGLI LDDQGA I 

ILSDISGLEDHI^DMDFKttAGSGKGITAFQMDIKVEGITPAIMKKALSOAKOGCNDILNI 

MN EALS APKADLSQYABft IETMQIKPTKIASVIGPGGKQIROIIE ETGVQ I DVNDLGWS 

I S AS5A5 A I NKAKE I LEG L VG EVEVGKTYRGR VTSW AFG AFVEVL PG K EG LCH I S EC S R 

QR I ENI S DWKEG DLf DVKLLS INEKGQLKLSHKATLE 

CPn_1000 1153193 1152891
rs15-S15 Ribosomal Protein

SAFAAI ILRRH/MSLDKGTKEEITKKFQLHEKDTGSADVQIAILTEH I AELKEHLKRSPK 
TONSRLAIX^VGQRRKLLEYLNSTDT ER YKNL I T RLNLRK 

CPn_1001 1153369 1153869

yfhC-cytosine deaminase

YYLEI^EKLINMEKDIFFMXAPKEARKAYDQDEVPVGCVIVKDDKI IARAHNSVEKLK 
DATAHAEILC IGSAAQDLDhMRIIJ7rVLYCTLEPCLMCAGAIQLAR I PR IVWAAPDVRLG 
AGG3WNIFTEEHPFHWSCTGGVCSEEAEHLMKKFFVEKRREKSEK 

CPn_1002 1153844 1154089

CT845 hypothetical protein
/ KS AERKVKNK IVTLLDQLYEDQESRLQKLGEE IVPNLTP EDLLQ PMDFPQLEGNPAFRF E 
f EGVLSGIGEVRAAILAALSQEN 

CPn_1003 1154862 1154092

CT846 hypothetical protein 

T^nCTIHPLLWGPDROIAGKASMRVIFPDKHNNFPNI^KLLKKLPSVILVTSCIAPFFSY 
MNKFFGIPGLLEILALSVKGIQKHHF^FLTYPLITADSLSIJJKDQSFEITO^ 
^FFTjFYKAIQHLIRKLGAFSVLWISGOALIIGAVLWGFMALIHSSQSFFGPESIICGV 
VLTVQ I FLDPEKRFT IGPTPLSVSI KWGFLFVLGFYCC I LI FSGAFLLLLASMLAIVLAI L 
/ FCKKEKI PNPYTTSLRF 

CPn_1004 1155418 1154879 

CT847 hypothetical protein 

HLS I EELMS IQ PVSNTTTKADKV I PDSTKV IS DS IT INKQS AFYFC I SVMLRLSESTTEY 
GKS ILAVLEDNT I VQOQRVKEL INLPLLKVPDLQKKDGS DDEYKNQNEIQAYQSSNQQ I S 
ANRQMIQQELSSAQQRA0ANQKSVNSTTIESM0ILOATSSMLSTLKELTIKANLTNSPSD 

CPn_1005 1155957 1155415 

CT848 hypothetical protein 

NRKPVRLNMWI IDPLSAKKPLQAAINVPGTPITGGPNTATADDI IAKFSKDSNPLIVTVY 
YVYQSVLVAQDNLS 1 1 AOELQANSSACTYLNNOEALYQYVS I PKNKLNDNSSSYLQNIQS 
DNQAIGASRQAIQNQISSLGNAAQVISSNL^n'NNNIIQQSLQVGOALIQTFSOIVSLIAN 

I 

CPn_1006 1156493 1155990 

CT849 hypothetical protein 

TKVNFF IMS ITTLGTLPTVNTINSSRPPLEPLNTPK IGAVLFS I YELLLQAI EIRQQTVL 
TQSQQL^NTNICXMLNQEri^IKYAIVSAGAKEDEITRVQNONQNYSAQRSNIODELVT 
TRQNGQIILSHASTNINIIG^SSQDSSFIKTTNSIGSTVNQLNKPLG 

CPn_1007 1156689 1156907 

CT849.1 hypothetical protein 

LWYKSLAGEEKDVSGNECNDYPEVFKDDVSAYVLVTCGOMSSEGK IQVEMTYEGDPAVI S 
YLLTKARDSLDES 

CPn_1008 1156904 1158223 

CT850 hypothetical protein

VLNYSF IGMLKPMYVLSKRLYRWVNQLI KLGDLVKNSRSFSVEWVF I SALLL IFGCLGCA 
SWKVSLVPFLLLFSFLAFrLILCFRGKCYALLLGVr/rLYVAKYWGETLYVSFWLSGL 
GVSFLLAFCLFLOGVWLAOEEEMVKGKEQLRLSEDLDAORSAYEDLLLTKSQEKEFLDAR 
AOGLDRELTEC0ELLK.WCKQEYLT I DLKI LADOWJ^WLEDYAELMNKY r ELVSKNGDV 
VFPWVAEPSVGESQGSERVDVSRWVSALQEKEESLERLRNEILVEKORCSDYEHRCQELG 
LLLQNFTALERRCEELONLLN0KETOINELHOLVCK3EEKVSVEPSAHAETSCVEEKQYK 
GLYSQLOEOFLEKSETLSLVRKKLFAVQEKYLTLKKKEELTKQDISFDDISMIQGLLERI 
EILEEE 1 /3HLEELVSRSL3L 

CPn_100 r J U53085 L 158 186 

map-Methionine Aminopeptidase

YRLLIIRVILMKRNDPCWCGSvlRKWKOCHYPOPPKMSPEALKOHYASOYNILLKTPEQKAK 
[ YNACO I TAR I LDEU'KASCKGVTTNELDELSQELHKKYDA I AAPFHYGSPPFPKT ICTS 

t jitv f:ir; i tno r plkqjd i mn i dv::c r vccyycktspmvm tgevpe i kkk tcoaalecl 

fJD:; [A I L.K TO [ P t ,( .'E U1EA [ EARADTYf IFrJWDQF'Vjl KJV» f EFHENPYVPHYRNRSMIP 
I ^1" IM f FT 1 ErM f NVi 'KKEt. '.WDPKNf.WEAR'PC Dltfj P'JA^iWEl tT I A [TETGYE ILTLLND 

f.'l*n_l ') 1 'I l!''-»'.7 r j I | r , '.»f)i, 7 

' TY'V hy[.«»i hi.-r umI f »t ot,< ; i n 

7MI . I t .1 .Nir.LI .KYVI.FP::ri \;\[\ -VKVALLKNI-'.SRKKOORV f [ ,RK( M, FALLAL t LFVTFGR 

:;i-TOKf.r,r:;LYAFO[ tiviF'Li.i-Tv::iKMMi^mPEKAKDr/pr:KTEPrFFP[AFPvrTnpA 

7 £TAM.:;YMEW; I YSKF. I t rrAM[ t AWAF:;i,FTU^~TFnRLFCNFr;LIJVUvRLFf: [AL 

iJ,M:;vrif^i,K l :t:;iAFNi.;Fvn; 

'.•l-n.l'i I 1 I 1»m) Un I I '/no:! 






TTHS i I iypor.het.ic.il protein 
EHRLKN YPM Itf F3FFLPQTC [ LLLASDSLTN I LALHII 1 
AMFALYGLALtiCLKVLNTPVCA I EWGCIAVTLACVRA^ 
PCrrtPCALPLMFCPSC 



;aiBk;kei 



:qrhlvllresffafi 

KEESWrPYKFNMSPSYS 



CPn_1012 1162220 1160421

yz*:Q-ADC transporter permease 

A[Fr;L[TSKMKKKFrFYFVTVF.^LLFLWEMTSRHPPTF5FFrPPP.S. , lIA".7rL0SLPLLL 

t. ■Awin-i.KA i ; ;* .a rr: .. ' i v; .at : y.i., :yk. :,\r: ; .: ! ; i .: .v t; y.--\LAi-f. :vlw 
-:.-f : :• :av : 1 " t.m.t: i-i-T'i r::.v rvv : ; :„.-rr : :a';-'j-v.: : rr-HALP 

H I FiGLKI A IGSAGFAA I AGEWVASQSGLG ILMLESRRNYEMELAFAGLATLS ILTLSLF 
QITLLIEKLIFS LFR VKRMSLKH KSVAKKALS VLAL I P I ML I PWKGNSKS P PDKKNLTS L 
TLLLDWTPNPNH I PLYAGVAKCYFK0HGLDL0UJKNTDSSSAVPHVLFEOVDMALYHALG 
IMKTSIKGMPIQIVGRLIDSSL(X;FLYRSODPIYKFEDLNGKVI£FCLNNSRDLNRLLET 
LNRNCWPS EVKNVSSDL I S PMLLNK I DFLYGAFYN I EGVKLQTLGMF/KC FLSDTCDLP 
TGPQL I VFTKKGTKASEPE I VEAFQKALQ ES I IFSKDHPEDAFKLYAKETKS I PKNLYQE 
YLQWEETFPLLAQSQDPLSKDLVDKLLETIIKRYPELiASEVAKFSLNDLYWPSLPEEOSV 

CPn_1013 1162209 1163624 

fumC-Fumarate Hydratase 

RENSLNHRGNIDMRQEKDSLGrVEVPEI)ICLYGA0TMR5R^FSVGPEl>lPYEVIRALVWl 
KKCAAQANQDI^FLDSKHCDMIVAAADEILEGGFEEHFPLKVWQTGSGTQSNMhA/NEVIA 
NfLAI RHHGGVLGSKDP I HPNDHVNKSQSSNDVFPTAMH I AAVI SLKNKL I PALDHMIRVL 
DAKVEEFRHDVKIGRTHUIDAVPMTLGQEFSGYSSQLiWCLESIAFSLAHLYELAIGATA 
VGTGLNVPEGFVEK 1 1 HYLRKETDEPF I P ASNYFSALSCHDALVDAHGSLATLACALTK I 
ATDLSFUGSGPRCGL^ELFFPENEPGSSIMPGK'yNPTCCEALQMVCAQVLGNNQTVIIGG • 
SRGNFELNVMKPVIIYNFLQSVDIXSEGMRAFSEFFVKGUCVNKARI^DNINNSLMLW 
LAPVU3YDKCSKAALKAFHESISLKEACLALGYI^EKEFDRLVVPEKMVGNH 

CPn_1014 1165456 1163732 

ychM-Sulfate Transporter 

ALASTLGYCIVKVTWAFKNFIPKLYTSIKEXrYSnTTFKKDFQAGITVGILAFPFAIAIAI 
GVGVS P IQGLLAS I IGGLLAS AMGGSNVL ISG PSSAF I S ILYCLSAKYGAEALFTVTLLA 
GVFLIAFGLTGLGTFIKY^PYPVVTGLTTGLAIIIFSSOIKDFXGLQMGANIPADFLPKW 
IAYWDHLWTWDSKSFAVGLFTLLIMIYFRNY^PRYPGVMIArVTATTLVWLLKIDIPTIG 
SRYGTL PTAI PLPK I POLS ITK ILQLMPDALT IAVLSGLETLLSAWADGMTGWRHQSNC 
QLVAQGVANIGTSLFSGI PVTGSLSRTAAS IKSEATTPI AGIVHSI FICFILLLLAPLTV 
KIPLTCI^VLILIAV^SEIHHFIHLFTAPKKDnn^TWILTVWITITMVQVa^ 
AAFLFMKQMSDLSDVI STAXYFDKDSDFLSKAEVPQNTE IYEINGPFFFGIADRLKNLLN 
D I EKPPK I F I LCMTRVPT I DASAMHALEEFFLECDRCCTI.r J.T AGVKKTPLADLKRYHLD 
ELIGVDHIFSNIKSALLFAQALTNLESKTSTRHLV 

CPn_1015 1165550 1166893

CT"8*£7 hypothetical protein (possible IM protein) 

KNtf^FSFFTSVRVRSKVBHEIILEVTMLKI^IjCALFLFOT 

AMGfiLMWLVCFSHIPMADHMILVEEIADMSQVIFFLFSAMAIVELIDAHKGFSVIVKFCR 
IQ$RTLLLWALIGLSFFLSAALDNLTSI I I I ISILKRLVKAREDRLLLGAICVIAVNAGG 
AV^LGDVTTTMLWINNKITSWG I IRALFVPS LVCVLVAGFCGQFFLRKRGSTLI AKDVE 
LOSAPPKSLWI I FIGLGSLLMVPVWKACLGLPPFMGALLGLGLVWLTSDWIHSPHGEDRY 
HL^PHILTKIDISSITFFIGIIJAVNALSFANLLTDFSLWMDKIFSRNWAIVIGLLSS 
VLUNVPLVAATMGMYTL PLDDT LWKL IAYAAGTGGS ILI IGSAAGVAFMGLEKVDFLWYF 
KRjpSWIALASYFGGLFSYFVLESLNFFI 

CPn_1016 1167027 1168898

hypothetical protein 

krI^kkgklgaiwgi^ft-ssvagfskdltkdnayodlnviehlislkyaplp^ 
fgwdlsotc^arujlvleekpttnycqkvlsnyvrslndymag itfyrtes ayi pyvlk 
lsedgh vfwdvqtsqgdi ylgde i levdgmg i reai eslrfgrgsatdysaavrsltsr 
saafgdavpsgiamlklrrpsglirstpvrwrytpehigdfslvaplipehkpqlptqsc 
vlfrsgvnsqssssslfssymvpyfweelrvqnkqrfdsnhhigsrngflptfgpilweq 
dkfijgyrsy i fkakdsqgnphrigflr i ss yvwtdlegleedhkds pwelfge 1 1 dhleke 
tdalritothnpggsvftlysllsmltdhplotpkhrmiftqdevssalhwqdlledvft 
deOavavlgetmegycmdmhavaslqnfsosvlsswvsgdinlskpmpllgfaqvrphpk 
hqytkplfml i deddf scgdlapai lkdngratl igkptagaggfvfqvtfpnrsg i kgl 

SLtfiSLAVRKDGEFIErnXVAPHIDt^FTSRDUTTSRFTDYVEAVKTIVLTSLSENAKKS / 
■ EEC^SPQETPEVIRVSYPTTTSAS 

CPn_1017 1166997 1169935

lytB-Metalloprotease

VI I^KLILCNPRGFCSGVVRAIQWEVALEKWAPIYVKHEIWNRHVVNALRAKGifl F 
VEELVDVPEGERVIYS AHG I PPSVRAEAKARKLIDI DATCGLVTKVHSAAKLYASKOYKI 
ILIGHKKHVEVIGIVGEIVPEHITVVEKVADVEALPFSSDTPLFYITCTTLSLDDVQtISS 
ALLKRYPSI ITLPSS5 ICYATTNR0KALRSVLSRVNYVYWGDVNSSNSNRLREV/O.RRG 
VP ADL INNPEDI DTN I VNH SG D I AMT AG AST P EDWQAC I RKLSS L I PG LCVEND/I F AVE 
DWFQLPKELRCS 

CPn_1018 11693P5 1170629 

No robust homolog present in Genebank/EMBL as of 11/7/98

rmsyfn-^qkns^rslgllakffsrllyrvffsfregiylfsslylkypr/ffydlgky 
vyslrhcpyakiigrlpgasllkegnvygetpwsvlakicqafdrtsodilvdlgcglgkv 
cfwfshwrcovigidnqphfirfssnmhrklssgfalfdteefkn\a^S(6asyvyfygs 
sfsrrllneiilklse^pgswisisfpldsfsrgkecfftekscsvr/pwcktiaykn 

IRKG3 

<JPn_10l9 1172146 117063S 

CTHt'.f) hypothetical protein 
[ H RRNI MTVSYOS 1 STPPP EGEFDI FVDCNATEEAVVAAEVQVALp/cEOYAMLRATS EL 
t:FG[LTyr;ECALTyALPPKEKPU?EE0FLVKNGILMP.3T5LPNLKfc0S0QT3LASHRNP 
LAOT:;T:J:;N5 , Pf'.KAJTETPSSiFPFFSCKAPECDdSVDKTFTVGWPKAOE00EA5AS0 
::OAOniVR:;Y:^GTIKRH^AKEKV^QSTK:^ETOKHTCTK:;DA^PMt-LY:;TLHKEVPO 
AL: W/JJJY.bEEUR DOROCIEG YECEOEOEE^K KKT PWCTVF.^ZfjT:; iTi'NQYY ES YT P I 
[PDPtVEFAL:;E::OI.JVL.AL;KRVTNLDV[ i R [CTELMKtJILKDflANDTMTRLEEP.ELMERE 
AHEI AA:;Y: IROAKYAKWI i \ t ATATLO 1 LT.A r A rMVG E I IJGDyl \t W-VQK I i^GPFKDATAK 
TFKKf;[';KV^r:;L::LMTKAi\:;KVHKLaE::AVRAVAEYnKFWRMI<y[)E\TRTfEEVKDrW 
K:;MI >Mn .! .N I L/.rrKHbAAK: :i .v«: 

<Tm. i hyj-ot t»-i pinr.-iii 

i :r ;n::tm:;: :wi ;<m: ;r-:vi .i .n<jdpy [ i -dapr: ;o\-,: ;y:: nvAi v f. ao k tj l p k f ftq k 
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sety^^Be 

PVtP^^^F 



DSEOOELLQSRREERSETY^^BEKK I ETrfVQ [KDLCKDLF^ODCDSNC-KQKKSPFQC 
DTSRKNRIAKAAQAVPV I P^^^TTLSYL^TKOC I L3DFSSYGCHKDSVESTQRELDA 
LHEKR I ET IKVZ IEKEKRERLWCSLSD I rGWLAPFVS IG IC I VAI LSGGGtFAFAGFFAG 
LISLVIKCLEKLKFWDWLEKHLPIKNEELRRKI TTI ICWVVYLTPVTLSICTLKVENLGF 
SPt IEGAIKGI0PAIESTMAALRCAILF90AEIYKLKGKLTK IQLDIELKSFDRDDHYER 
S0ELLDNMESSFEAL3R I LfJYMRELDQVy LH5LRC 



CPn_1021



1174270 /1 17369!? 



.in:, t '"•••L;*.\iu,r;TJ: ! aa. '.".<'.•". ^-TFn ■n.E::Yf.E'jr/r"KN : 

TYOKIFKISSEDLEKV-r-KEGYHAYt^KDYAKJITV-FHWLVFKNPFVSKFWFSLGAoLHMS 
EOYSQALHAYGVTAVLRDKDPY PH^YAY ICYTLTNEHEEAEKALEMAWVRAQHKPL YNEL 
KEEILDIRKHK 

CPn_1022 1175/09 1174216 

CT863 hypothetical protein 
FS FFFY ALKLQ I MNMPVPS Aw S AN ITLK EDS STVSTASG I LKTATGEVLVSCTALEGS S 
STDALISLA^JGQIILATG^E/I*LQST^A/HQLLFLPPEWELEIQVVD[XVCLEHAETITS 
EPOETQTQSRSEQTLPC^SSSKOSALSPRSLKPEISDSKC^AI^PKDSAVRKHSEAPS 
PET0ARASLS0ASSSSQR3LPPQESAPERTLLEQQKASSFSPLSQFSAEKQKEALTTSKS 
HELYKERTODRG^REOHDRKHDQEEDAESKKKKKKRGLGVEAVAEEPGENLDIAALIFSD 
QMRPPAEETSKKETrFKIOCLPSPMSWSRFIPSKNPLSVGSSIHGPIOTPKVII^LRFM 
KLMARI LGQAEAEANELYMRVKQRTDDVDTLTVLI SKINNEKKDI DWSENEEMKALLNRA 
KE I GVT I DKEKYTWTEEEKRLLKENVOMRK ENMEK ITQMERT.DMORHLQEISQCHOARSN 
VLKLLKELMOTF I YNIiRP 

CPn_1023 1176008 1176331

No robust homolog present in Genebank/EMBL as of 11/7/98
GLDFLEIFIMKfCyvTLS 1 1 FFATYC AS ELS A VTWAVPLS EA PG K I QVR PVVGLQ FQ EEQ 
GSVPYSFYYPYD^GYYYPETYGYTKNTGQESRECYTRFEDGT I FYECD 

CPn_1024 1177317 1176334

xerD-Integrase/recombinase

IFFFPWSL^SLKIAPLPILKLHSIJlShrrMPSTOFHTTILEQFSLFLSVDRGLCQQSIAA 
YRQDI SSF^I S AISS PQDI SQNSVY I FAEELYRRKEAETTLARRL I ALKVFFLFLKDQQ 
IJjPYPPILEHPKIWKRLPSVLTPQEVDAIJLAVPLCtlEJQ^PRliLA^ 
VSELCDLRLGHVSDDC IRVTGKGSKTRLVPLGSRAREAI DAYLCPFRDQYOKKNPHEDHL 
FLSTRGHkLE^SCVWRRIHNYAKQVTSKPVSPHSLJlHAFATH 
RIASTE^YTHVAADSLIEKFLAHHPRNL 

CPn_1025 1177266 1178879

pgi-Glucose-6-P Isomerase

3FSSYREiCrMERXRFIDCDSTKILQELAI>IPI^LTAPGVLSA£RIKKFSLXGGGFTF 
SFOTEiU^DAILAALISLAEERGLHESMLAMCXJGQVVNYIEGFPSEhRPAiJfrAT 
DSfiFTGEAEDIAVR^RVEAQRIJCDFLTKVRSQFTTIVQrGIGGSEI^PKALYRAL^ 
^ TBKHVHF ISNI DPDNGAEVLDT I DCAKALVWVSKSGTT I ET AVNEAFFADYFAKKGLS F 
DHFIAVTCEGS PMDDTGKYLEVFHLWES IGGRFSSTSMVGGVVLGFAYGFEVFLOLLQG 
3 1 ALQPNARENLPMLSALI S IWNRNFLGYPT EAVI PYS SGLEFFPAHLQQCCMES 
AQDGRRVGFSTSPVIWGEPGTNGQHSFFQCLHQGTDI I PVEFIGFEKSQKGEDI S 
TS SQKLF ANMI AQAI ALACGS ENTN PNKNFDGNR PS SVLVS SQLN PYS LGELLSYY 
tfVFCGLCWINSFDQEGVSLGKAIJ^VLELLEGADASNFPEAASLLTLFNIKFR 



CPn_1026 1178961 1179137



CSFGFGKICEDRMFFIAVRSRGFLDIHG I LAARKGKQWKSTAGAWIGSRGAVFYSLVS 

CPn_1027 1179172 1180755

No robust homolog present in Genebank/EMBL as of 11/7/98
NMPGSVS SP PLS PVIVRERVPS SSGSDL IQPHAVLK I S I LI FALVT I LG I VLWLS SALG 
AL PSLVLTVSGC IAIAVGL IGLG I LVTRL I LSTI RKVDAMGYDAAVXEEQYLSRI RELES 
ENRE I RDRNRAVEDQCAHLSEENKDLRDPEYLHGMTERL I AS LEI ENQALVAENI LLKDW 
NASLSRDFRAYKQKFPLGALEPWKED I AC IMEQNLFLKPEC I AMVKSLPLETQRLFLYPK 
GFQSLV^FAPRSRFFCTPKYEYT^SRNENEDGKVAAVCARLKKEFFSAVLGACSYEELGG 
I C ERAVALKETL PLPEAVYDTLVQEF PNLLTAES LWKEWCFYSY PYLRP YLSVDYCKRLF 
VQLFEELCLKLFTTGSPEDOAL\^FSYYRimiPAVIASFGLPPPETGGSVTVLLPK0EN 
LLWSQIEVIATRYLKOTFVRWSEWTGSFEMMFSYNEMCKEISEGRIRFAEDYETRHSEEF 
PPSPLSEEGEGEEFLPPCSEEEVSVLERPDLDVDSMWVWHPPVPKGPL 

CPn_1028 1180995 1181999

mdhC-Malate Dehydrogenase

F F LKGVRMAFKEWRVAVTGGKGQ I AYNF LF ALAHG DVFGVDRGVD LR I YDVPGTERALS 
GVRMELDDGAYPIXHRLR\rrTSLNDAFDGIDAAFLIGAVPRGPGMERGDLLKQNGQIFSL 
OGAALm'AAKRDAKIFVVGNPVhrrNCWIAMKHAPRLHRKNFHAMLRLDQNRMHSMlJ^HRA 
EVPLEEVSRWrWGNHSAKQVPDFTOARISGKPAAEVIGDRDWLENILVHSVQNRGSAVI 
EARGKSSAASAS RALAEAARS I FCPKSDEWFSSGVCS DHNPYG I PEDLIFGF PCRMLPSG 
DYEI I PGLPWEPF I RNK IQ I SLDE I AQEKASVSSL 

CPn_1029 1181987 1182844 

No robust homolog present in Genebank/EMBL as of 11/7/98
R VFV I STMLWGVSMRQS FDELSQNAF KM I FNKQR FC F I FCSLCC FC FVFALF LKLC S RLA 
PEISLSTLGLGAFFCAFSVICASAI I VQFLLHKESQCETSKLCCAI KNTWSSLWLSLLVS 
MPFFIAMVAWTVAMLSSFLCSLPWVGKLFHTVLIFI PYLSATALILLFLGSFSCLFFC I 
PVLHNQ E3 1 DY R KL LEC FRGN I LRQF IG W I A LVPLALC SWLALDS FYLMTH LVE I AD I H 
TWSFLAOMFVLIVPIALILTPAVSFFFNFSFSFYLAKOEEEKALVK 

CPn_1030 1183901 1182843

predicted D-amino acid dehydrogenase

FKVHFMRIAVUJAGYAGLSVTWHLLLHSOGTATIDLFDPIPLGEGASGMS^GLLHAFTGK 
KALKPPLADOC rNATHALTTEASKALNVP IVI SQG r LRPA IDEDQAOLFTERVEEFPKEV 
R-WWEKAHC R i s I r^MV \ PPNLGALF t K.'JGVTLMHDLY [OGLADAl'MKLi 3T0FYDEL I EDL 
AD I EEFYDH [ fVTPCr\NA:; rt.PELKDMPVNKVK^QLLEI.'JWPKDt^MLSFL* INAHKYMVA 
^r^OK^rP^■l[.A;ATFEMNOPECTPDPA^AYOElM^*PVL:;LFP r ;LKf)AOVLM^:YAl.»1RS^SK.'3 
Rl.rVt:JR[HF.KI.WF[A«[jU::KGLLYMr;iT^r»1t.AgAVI.r'iK.';TAYIAKEFLrr[ 

« .T'n_t')il [ | hs '',»;■; llfMO'^ 

arcD-Arginine/Ornithine Antiporter

i Ki-:rFMT:;RTK:;.iKNU rp t ALAt ;Mwr;r; i ra;r; f Fm.f'f/f imaatai :ai ;av i l.;w r ltgr; 
mkf [ArrrFRi l::t[rpdlkb: I ymy:;p.E';pv ;i-y f;pp ir;wr;Ywu:u i i- v :n\\:yav itmda 
i n\ i-i'ft/Fo x ; wr L pa iu x ;:: 1 1, iwvfmk r vr x<\ r hqa:; i i rn/ 1 t kk i r via iful 
takffkijwi-ktdi-t* :iiAVPKA0P::Lf ;.*"/::: ;qi y^PMf.vTf.WAF r< : i vs :avvm:;' ikaknp 
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VYFr^NAWrrTMLSrTGVMVLPAYL^AAFLFKLSKS 
SLWL I YAUGLKYLFMALVLLALG I PFY I DAGKKKKNJ 
FLFLTGR IKl 



iajhWrki 



IpVSLY IT3SVMQLAMLL 
IKAPLAMITCILGWY 
EIVCMTFICLLALTAI 



RFFLERGVLLRPLGNTLYVi 



EEDLR/T EY3HL0DALCL0E\! 



CPn_1032 US6153 U35566

CTJ7} hypothetical protein 

r.,MAY^TPYPTr ( AFHT^^rnF.rDDCMrPOPFCTFr^D^AL[^AKtF^FTITVPYTSVLPKEL 

i" ;;;; vi v; "-.vf'. i i' ;AVi .i-.v :MA' ;A/u,.;h ;T!i ; \: Ar- ;:. ; h wikuFN; i« '/;waa 

ALGFLNFENAEPAKVN 

CPn_1033 1187656 1186187 

CT372 hypothetical protein

^KKKDYSGEFLTTDTVDSIAFLPSEE^FCTrKTILFFRVKKKHYAFFYGEFMISFRFTX 
LSGLCALC I S SY AET PKETTGHYHRYKAK IQKKH PES I KESA PS ET PHHNSLLSPVTNI F 
CSH PWKDGI SVSNLLTSVEKATNTO I SLDFS I LPQWFYPHKAU3QTQALEI PSWQFYFS P 
STTWTLYDSPTAGQGIVDFSYTLIHYWQTNGVDANQAAGTA^ 

QT F PGD FLTLA IGQYS LYA IDGTLYDNDQYSG F I SYALSQNASATYSLGSTGAYLQFTPN 
SEIfCVQUSFQDSYNIDGTNFSIYNLTKSKYNFYGYASWPKPSCGDGQYSVLLYSTRKVP 
EQNSQVTGWSLNAAQHIHEKLYLFGRINGATGTALPI^SYVLGLVSENPLNWiSQDLLG 
IGFATNKVNAKAISNVNKLRRYESVMEAFATIGFGPYISLTPDFQLYIHPALRPERRTSQ 
VYGLRANLSL 

CPn_1034 1188589 1187732 

Predicted OMP (CT371) [leader (18) peptide] 

KTSWQKYKKYLSYS ILVQKIARYVMKTWLFFTFLFSCSSFYASCRYAEVRS IHEVAGDIL 
YDEENFWLI LDLDDTLLQGGEALSHS IWKSKAIQGLQKQGTPEQEAWEAWPFWIEIOEM 
GTVQP I ESAIFLLIEK IQKQG KTTFVYTERPKTAKDLTLKQLHMLNVSLEDTAPQPQAPL 
^KNLLYTSGILFSGDYHKGPGLDLFLEICTPLPAKIIYIDNQKENVLRIGDLCOKYGIAY 
■^TYKAQELHP P I YFDN I AQVQYNY S KKLLSNEAAALLLRHQMH E 

CPn_1035 1190081 1188570

aroE-Shikimate 5-Dehydrogenase

|WQLPyLMVP I VHLQIWR F SM IYYGVSVMLCATVSGPS FC EAKQQ I LKSLHLVDI IELRLD 
^INE^IX5ELin*LITTAQNPILTFROHKEMSTALWIQKLYSLAKLEPKWMDIDVSLPKTA 
MKSHPKIKLILSYHTDKNEDLDAIYNEMIJVTPA£IYKIVLSPENSSEALNYIKKAR 
LL^PSTVLCMGTHGLPSRVLSPLISNAMNYAAGISAPQVAPGQPKLEELLSYNYSKLSE 
KSHIYGLIGDPVDRSISHLSHNFIX.SKLSLNATYIKFPVTIGEVVTFFSAIRDLPFSGLS 
MPLKT A I FDHVDALDAS AQ LC ES I NT LVFRNQK I LGYTTTDG EGVAKLLKQ KN I SVNNK 
H/^VG AfiGAA^I AATLAMQGANLH I FNRTLSSAAALATCCKGKAYPLGSLENFKTIDI 
PPB^^FPPIVMDINTKPHPSPYLERAQKHGSLIIHGYEMFIEQALLQFALW 
ESCDSFRNYVKNFMAKV 

CPn_1036 1191180 1189984

aroB-Dehydroquinate Synthase

GY^KjPCSCRSC 1 1 PTMLQTMMS ET I ITTPHWKLISNFFQKKLFSSISTAYPLVIITDVS 
VOffillG P I LDH I KMLG YQV I VLTF P PG E PNKTWET F I SLQYQLVDQNI S PKSS I IG IGG 
. GT&L^DMTGFLAATYCRGL P LYL I PTT ITAMVDT S IGGKNG INLRG I KNRLGT FYLPKEVW 
MC^FLSTLPREEWYHG IAEAI KHGF IADAYLWEFLNSHSKMLFSSSQI LHEFI KRNCQ I 
KAiftiVAEDPYDRSLRKILNFGHSIAHAIETIJ^KGW 

T P^Li IDQLERLLKRFNLPSTLKDLQS I VPEHLHNSLYS PENI I YTLGYDKKNLSQHELKM 
I^jgiLGRAAPFNGTYCASPNMEILYDILWSECHVMRHC 

CPn_1037 1192286 1191123

aroC-Chorismate Synthase 

LHFSRGSRRSFLEELLRTSVSRSHYLVKVMKNSFGSLFSFTTWGESHGPSIGWIDGCPA 
GLEbHESDFVPAMKRJIRPGNPGTSSRKENDIVQILSGVYKGKTTGTPLSLOILiOTDVDSS 
PYEJ^ERLYRPGHSQYTYEKKFGIVDPNGGGRSSARETACRVAAGWAZKFLANQNIFTL 
AYLSSLGSLTLPHYLKISPELIHKIHTSPFYSPLPNEKIQEILTSLHDDSDSLGGVISFI 
TSP r HDFLGEPLFGKVHALLASALMS IPAAKGFEIGKGFASAQMRGSQYTDPFVMEGENI 
TL&$>JNCGGTLGG ITIGVPIEGRI AFKPTSS ikrpcatwktkkettyrtpqtgrhdpcv/ 
A I RAVP WEAM I NL VLADLVLYQRCSKL 

CPn_1038 1192750 1192199

aroL-Shikimate Kinase II
WKLiLFtNVMTIIlXTGLPTSGKSSLGKA^ 
AYQE^KFS ECEAR I LETLP PEDAL I S LGGGTLMYEASYRAIQTRGALVFLSVELPlZyER 

lekrglperlkeamktkplsei lter i drmke i ady ifpvdhvdhssks sleqaso/dlit 

LLKS 

CPn_1039 1194011 1192665 

aroA-Phosphoshikimate Vinyl transferase 
yCFTMLTYKVSPSSV^'GNAFI PSSKSHTLRAILWASVAEGKSI IYNYLDSPKf EAMICAC 
KQMGAS I KKFPQ I LEI VGNPLAI FPKYTL I DAGNSG IVLRFMTALACVFSKZ ITVTGSSQ 
LQRRPMAPLLQALRNFGASFHFSSDKSVLPFTMSGPLRSAYSDVEGSDSO&ASALAVACS 
LAEGPC3FT 1 1 EPKERPWFDLSLWV^EKLHLPYSCSDTTYSFPGSSHPC^SYHVTGDFS 
S AAF I AAAALLSKSLQP I RLRNLD I LD IOGDK I FFSLMQNLGAS IQYDNEEI LVFPSSFS 
GGS IDMDGC I OALPI LTVLCCFADS PSHLYNARS AKDKESDR I LAITESLQKMGAC IQPT 
H DC LL VN P. 1 >* F L YC AVLDSH DDH R I AMALT I AALYASGDSRI HNTACVRKT F PNFVQTLN I 
MEARIEECHDNY3MWSTHKRKVFARESFC 

CPn_1040 U94S76 11*4073

No robust homolog present in Genebank/EMBL as of 11/7/98

rpsoglflrtwspsssfrehtvcaapllyprrrspdylfsptgc/mstttvkhfihtasr 
wer/lke^vaj^^^iaowi^^ , lsflensgakkrsasehptevkecvlkhaaeefrhghyl 
kto 1 3 r i h ets l pd yt 3 kn llgg llt k yy lh lldlrtcr vlenjeys lsgqtl ktaa y i lv 

TYA [ ELRAIjELY PLYHD I LKEAQSK I TVKS 1 1 LEECCHLQEM/rELKDLPHCEELLGYAC 
OFEGELCljOFVEFiI.EOMrFDPSsTFTKF 
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CPn_L042 1196629 . 11759)4 

•bioD-dethiobiotin synthetase 
NRSPFTYFRANFFMQR 1 1 1 VG I DTGVCJCT I VSA I LARALNAEYWKP IQAGNLENSDSN IV 
HELSGAYCHPEAYRLHKPL3PHKAA0 IE5NVSIEESH ICAPKTTSNL I IETSGGFLSPCTS 
KRLQGDVFSSWSCSWI LVSQAYLGS I NHTCLTVEAWRSRNLN I LGMWNGYPEDEEHWLT 
OETKLP I IGTLAKEKErTKTI tSCYA£CWKEVWTSNHOG IQCVSGTPSLNLH 



bioF 2-Oxononanoate Synty 
PMLCOOFLIEALARRKSKHTYRSLSLNSHLI DFT3NDYLGFASS PELRKEY ITKLHAI ES 
LGATGSRLLTGHSQLCQRI EEQ^AYHNFESCLI FNTGYTANLGLLYALATDODRI LHDL 
YI HASI YOG IRi>SKAOSFPFNl^LNHL^KRIASSHLGRTFVCVESVYSLHGSVAPLQAI 
SEI£ERYSAYLIVDEAHAVGVTGDCGEGLVSALGWDKV^ 

s iucdylinfcrpfiyttaq/phaltai ELAYEHNQRAFNQREHLSALI hhfrekaonlg 

LOLMKD^^ITTPIQSICVSG^RARQAAI^IQNSGYDVRPIVSFTVKOREELLRICLHAFTJ 

tkneidhllhtleo r flcn^ssl 

CPn_1044 1198700 1197699

•bioB-Biotin Synthase

akhmreetvswsledirei yhtpvfelihkanai lrsnflhs elqtcyl is i ktggcved 
caycaossrykth\t/epmmkivdwerakravtlgatrvclgaawr.nakddryfdrvla 
mvks itdlgaevccaLgmlseeqakklydaglyaynhnldss PEFYETI ITTRSYEDRLN 

TLDWNKSG I STCCCGIVGMGESEEDR IKLLHVLATRDH IPE5VPVNLLWP I DGTPLQDQ 
PPIS FWEVLRT I ATARWF P R SMVHLAAG RAF LTVEOQTLCF LAGAN S I FYGDKLLTVEN 
NDI DEDAEM I KLLCL I PRPS FG I ERGNPCYANNS 

CPn_1045 1199602 1198901

•conserved hypothetical bacterial membrane protein 
GTLPMNTSHR^LVFSYl^STFTLLLVLSNLVLSSKL I PTTF FNF 1 1 PGGLI LYPLTFL I 
SDWNE I FGPKKARVM I FS AF I ANLLASS IVQ I FMFFPVAS PEMQTAWHCLFDLSPLRFL 
ASU^IV^LDIVLYTFn<NRTPNSSLWLRShGSTWISQIPDTFIVI^ 
FPC/TLWIMFYSYIYKITFCVLTTPIJyiAVNTIRKFLGMPS^ 

CPn_1046 1200675 1199590

•Tryptophan Hydroxylase
vhyce/tldpkyiu<ialklrqslslffonsoswraystpysyyriilokenkekoala 
rhkc/sii£ffknii*fvhli^lskn0rhx:stdmawstpffnp^wy^ 
ycp^ffldyleafgllsdfldhqavikffelethfsyypvsgfyaphqylsllqdryfpi 
[rtldkd^sltpdlihdllghvpwllwpsfseffinmgrlftkviekvoalpskko 
rli^lqsnliaivrcfwftvesglienhegrkaygavlisspqelghafidnvrvlplel 
^irlpfrrrstpqetlfsirhfdelveltsklewmldqglles i plynqekylsgfevl 



CPn_1047 1200537 1201343

dapB-Dihydrodipicolinate Reductase
FGSRNMGSSMHVGVIGCSGRTGKVIVSALEQSSEYTLCPGFSRSSALTLFQVIAHNDVLV 
DFSHPLLTKEWAHLLISPKPLI IGTTGFPGKCKEAHDSLEELTH IVPVWCPNASLGAY 
^KRLVMIXSOLCNPQFDIRIRETHHRYKKDSLSGTAODLLDTIC^VKQED^GEE^ 
ESKKTIEVQSSRVGDIPGEHEVAFISSGEQILVRHTVFSRI*A/FGRGILSIIXWLKTLW 
PQfGLYSLJGDTLELVLRNEHCLLKKTTDH 

CPn_1048 1201588 1202604

asd-Aspartate Dehydrogenase 
GERKGMRIAVLGVTGLVGQKFVAl^HKVrniDWVIAEW 

MPEMVRDLPIRKIEEVQSDIWSFLPSSAESMEAYCLSC^KVVFSNASTYRMHSSVPII 
/I PEVNSDHFQLLEEQPYPGKI ITSPNCCVSGITLALAPLRKFSLDHVH IVTLQSASGAGY 
PGVPSLDLIANTVPHIVGEEEKILRETVKILGSSKQPLPCKLS 

VTFSKDVDLDEILYSYQEK>WEFP^YOLYDNPWSPQARKHLSHDDMRVHLGPITYGGDF 
RTIKMNVLIHNLVRGAAGTLLASMENYFFDYLKREMCLR 

CPn_1049 1202586 1203914 

lysC-Aspartokinase III 

EGNVSKIVYKFGGTSIATAENICLVCDIICKDKPSFVWSAIAGVTDLLVDFCSSSLRER 
EE^RJCIEGKHEEIVKNLAIPFPVSTWTSRLLPYLQHLEISDLDFARrLSLGEDISASLV 
RAVCSTRGWDLG FLEARSV I LTDDSY RRAS PNLDLMKAHWHQLELNQ PSYI IQGF IGSNG 
LGETVLLGRGGSDYSATLIAELAP^TEVRIYTDVNGIYTMDPKVISDAQRIPELSFEEMQ 
NLASFG AKVLY P PMLF PCMRAG I P I FVTST FDPEKGGTWVYAVDKSVS Y EPR I KAL SLSQ 
YQSFCSVDYTVUK:GGLEEILGILESHGIDPELMIAQN^AA;GFVMDDDIISQEAQEHLVD 
VLSLSSVTRLHHSVALITMIGDNLSSPKVVSTITEKLRGFQGPVFCFCQSSMALSFVVAS 
ELAEG I I EELHNDYVKQKA I VAT 

CPn_1050 1203884 1204798 

dapA-Dihydrodipicolinate Synthase 

l^KTKSYSRHVGRrMHLLTATVTPFFPNGTrDFASLERLLSFQDAVCNGVVLLGSTGEGL 
SLTKKEKQALICFACDLQLK\ r PLFVCTSGTLLEEVLDWIHFCNDLPrSGFLMTTPIYTKP 
KLCGO I LWFEAVLNAAKHPA I LYNI PSRAATPLYLDTVKALAHH POFLG I KDSGGSVEEF 
OSYKSIAPHIOLYCGDDVFWSEMAACGAHGLISVLSNAWPEEAREYVLNPOEQDYRSLWM 
ETCRWVYTTTNP IG I KA I LAYKKA ITHAQLRLPLS I EDFDLErA/SPAVESMLAWPKLRTS 
VFSYS 

CPn_1051 1204956 1205270

No robust homolog present in Genebank/EMBL as of 11/7/98 
FFMTPKSIQCLHLrKTrDPVRKISPVTTKKSSFFROSLLRFLELFWMFLYC TRSIRFHCV 
HIATFICRGLILFLTTLFLSMIC ILHFITLPWICKEDPR I IRKNK 

CPn_1052 1205402 1206 1 6^ 

No robust homolog present in Genebank/EMBL as of 11/7/98

FF rOKHKYN^REK IK^ALR [CJ.^YC rTVFRNNF.'JLnCYDKI FY.'JLJCYVFNCtPNS LGRCR 

:TCFFR';KKTEVETKEVKrKPEIRP:iLFy;NDr , VKVAE^FPKRH.^LR:;L^.'K>:*.';rGNLCA 

c:;NFLD:;QML::RNF:;KKiw^:rrrFTR:;K:;T(:nAEy;:;EPKRrrAa:Yt tf v;LR^ 
o\/tac l u> ; r lk dv: j d: ;i i rt r at r; : ; r l^ ^ 

VRPDLKKKFRLKKCKD 

'.'I'Ii.I'J'j 1 IJt)r. t iH i;:f)'«V()l 

No robust homolog present in Genebank/EMBL as of 11/7/98
kk:;::iji.nkak [tJiiNHLYLTruii^/ivAf.'i'iL: : , i , u:i,fw:;i:KA:;n[-:vLV , ('::KFRi:[r:(^EP 
::RLAr:;GND*rYY:;[V::ta'[(W-irrRYr:;i-y;Ri!n!'N[nM!ivAPKft^w[:;if(;Tiu-:AKErp<5 
: :.';koyaf rr. ltare: :i ,m t : ;ek e .amtfov: ; f-:v i ijnr : y: :o* tk'/tktni .k fsjym i r .: mwpf.F 
L:Lr;vKr;AF 
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.;pn K>VI L2070KJ 120046b 

No robust homolog present in Genebank/EMBL as of 11/7/98
•3RWI IHR F [MQVLLS PQLPP P PQH5VG5 1 SS PS KLRVLAI TFLVFCMLLL 1 3GALFLTLG I 
PG L3AA I SFCLG IGLSALGGVLMI5GLLCLLVKREI PTVRPEEI PECVSLAPSEEPALQA 
AOKTtJVOLPKELDOIXJTDIQEVFACLRKLKDSKYESRSFUroAKKELRVFDFWEDTLSE 
[ F ELPQ t VAQECWDLNFL I NGGRSLMMT AE3 ES LDLFHVSKRLCYL PSGOVRGEGLKKS A 
KrrVARrW.nLHCETHKVAVAFDRNSYAMAEKAFAKAL^ALEESVYRSLTQSYRDKFLESE 



;-:uoa 



i-: k k l i j [ ;a k k h w k k i ■' i ; k.a v iwme d 



■ cr/NLUiuw:.' 



:-vi-- wmi je :tfi i ki ;•: kkttfj .kp. ljm'K« \ai «\ ktt f kk k r ? ' k Kf jloav e eanar n l k 

'rVRDWDQEFOKAGERLEKLHALYPE^SVSIRENKIQETRSNLEKAYEAIEENYRCCVRE 
OEDYWKEEEKREAEFRERGNKILSPEELESSLEQFDHGLKNFSEKLMELEGHILKLiQKEA 
TAEVENKILSDAESRLEIVFEDVKEMPCRIEEIEKTIJ^MAELPLLPTKKAFEKACSQYNS 
CAEMLEKVKPYCKESLAYVTSKERLVSLDEDLRRAYTECQKRFQGDSGLESEVRACREQL 
RERIQEFETGGLDLVEKELLCVSSRLRWrECDCVSGVKKEAPPGKKFYAQYYDEIYRVRV 
QGRV^MSERIJIEGVQACNKMLKAGLSEEDKVLKEEEYWLYREERKNKEKRLVGTKIVAT 
QQRVAAFES IEVPEI PEAPEEKPSLLDKARSLFTREDHT 

CPn_1055 1209583 1210521

No robust homolog present in Genebank/EMBL as of 11/7/98 

CKYLYHHSYPPPPDHSVGAFFCLSKFRVLAITFLVLGVLFLISGALFLTLGISGLSAAIS 

FG LG IGLS ALGGVLVVSGLLC LLAKREVPTVR P EE I PEGV5V APSEEPALQATQKTLAQL 

PKEL'TOLDRYIQEWSCLGKLKDLRCEDWIXKDAKEKWVF^FVWKDMKTEFVEWQIM 

DQEGWYLKCLIQE2^IGSTLFMSQVSLFKLWE^YLPSGDWGERLKKSAR£VVDRFM 

RRICDTRKVAMTFDRNAYGVAKTAFEKAFGALETCVYKSWTESYREAFCEYKKTKILRDE 

EKILRICYLELRR 

CPn_1056 1210482 1211228 

No robust homolog present in Genebank/EMBL as of 11/7/98 
GEDIKDMLSRVEEIEMMI^VIELPIXPIKQALEKAFVQYNSYKAKLTKVEPCFRESPAYI 
TSEERLQSLDGTLERAYKEYQKRFQEPSRLESEVSGCREHLREQVKQFETQGLDLIKEEL 
I FVSDVLFRKMVSCLVSTVHVPFMEFYYEYFELHRLRU^ACVMANAEIYS^ 
KETLEKAKAPREEEYWLLC EERKS KEKRL I LNK I EAAQQRVKDLEP P P I KETGKQKRKKE 
YSFFIRLKS 

CPn_1057 1211467 1213596 

CT356 hypothetical protein 

IIHFYFFNFAMPEPLYTNKLITEKSPYIXLYAHTPVNWYP^ 

GCKHSRWCQVMLQESYTNPEIAAMI^EYFVNVKVDKEELPWAKLYGDI^^^VSGDH^ 
ETVSWPLNVFLTPDLVPFFSVNYLGNEGKLGLPSFPQI IDKLHFMWEDAEEREALVEQAM 
KVLEIASFLEXSCVRKEII^ESSIJCRWAALYQDIDPHYGGVKAFPKRLPGIJXQFFLRYS 
LEjYQESRGLFFVDRSLSMVALGGVRDH IGGGVYSYT IDDKWL I PAFEKRLIDNALMALNY 
LE23?ACLGKEEVRGIGKQI LSYILSELYS PE^GAFYSSEQAENMGAGGQNFYTWSVEEIS 
NAILED AE I FCDYYG I SREGFFNGRNILH I PVHRE I EELSEKYHRS IEAI EDIVDRSRDI 
LKG^lRAQRS HRS KDDLSLT FNNGWM I YTF AY AGRLLGEVEY I E IGKKCG EFVRNSL YKHH 
ELY^WREGEAKYRASLFJ3YGALILGVLALYESGCGSFVLSFAEEIJ^QEVV^ 
FYiSVDGRDSTLLIKQS PLSDGET I SGNAL ICQCLLSLHL ITEKKHYLTY AED I LQ I AQAC ~ 
Atm^SSI^LLIASO^FSRKHVKVLIALGDQEDRSPVUCCLSGLFLPYI^LIWM^ 
QE^rrVLPEYEHCLIPKGIXrTATTIYVLEVDQCKRFKDLELFRRYLISL 

CPn_1058 1213742 1214836

CTi"5'5 hypothetical protein 
EVKKLYQTLRG I VLVSTGC I FLGMHGGYAAEVPVTS SGYENLLESKEQDPSGLA IHDRI L \ 
FKvftEENVVTALDVI H KLNLLFYNSY PHLIDSF PARSQYYTAMWPWLESVI DEFLMVAD > 
AKft^R I ATDPTAVNQEIEEMFGRDLS PLYAHF EMS PND I FNVI DRTLT AQRVMGMMVRS K j 
VMEJCVT PGK IREYYRKLEEEASRKVI WKYRVLTI KANTESLASQI ADKVRARLNEAKTWD/ 
KDRLTALV I SQGGQLVCS EEF S RENS ELSOS HKQELDL IGY PKELCGLPKAHKSGYKLY] 
LLDKTSGS I EPLDVME SK I KQHLFALEAESVEKQYKDRLRKR YGYDASM I AKLLSEE" 
LFSIsL 

CPn_1059 1214848 1215678

ksgA-Dimethyladenosine Transferase
VTRSS PAQLSRFLS E I QNK PKKSLSQNFLVDQN I VKKI VATS EVI PQCWVLE IGP^FGAL 
TEEUJ AAGAQVI A I EKDPMFA PSLEELP I RLE I IDACKY PLDQLQEYKTLGKGRVVANLP 
YHtTTPLLTKLFLEAPDFWKTVTVMVQDEVARRIVA^ 

IO/SSSCFTPKPQVQSAVIHMKVKETLPLSDEEIPVFFTLTRTAFCX3RRKVtJ^jyLKGLYP 
KEO^EQALKELGLLUJVRPEVLSLNDYLALFHKMQAG 

CPn_1060 1217694 1215727

dxs/tkt-Transketolase 
YKRFLYIHITKVMTSSSCPLLDLILS PADLKKLSISQLPGLAEEIRYRIlfeVLSQTGGHL 
SSNLGI VELT I ALHYVFSS PKDKF IFDVGHQTYPHKLLTGRNNEGFDH IftNDNGLSGFTN 
PTESDHDLFFSG HAGTALS LALGMACTT P LESRTHV I P I LGDAAFS CGETLEALNN I STD 
LSKFWI LNDNNMS I SKNVGAMSR I FSRWLHH PATNKLTKQVEKWLAiTI PRYGDSLAKHS 
RRLSQCVKNLFC PTPLFEQFGLAYVG P I DGHNVKKL I P I LQS VRNLPjFP I LVHVCTTKGK 
GLDQAO^PAKYHGVRANFNKRESAKHLPAIKPKPSFPDIFGGTLCELGEVSSRLHVVTP 
AMSIGSRLEGFKQKFPERFFDVGIAEGHAVTFSAGIAKAGNPVICalYSTFLHRALDNVF 
HDVCMQDLPVTFArDRAGLAYGDGRSHHGIYDMSFLRAMPQMIIGQPRSQWFOQLLYSS 
LHWSSP3AIRYPNIPAPHGDPLTGDPNFLRSPCNAETLSOGEDVLI IALGTLCFTALSIK 
HQLLAYGISATVVDPIFIKPFDNDLFSLLLMSHSKVITIEEHSIRGGLASEFNNFVATFN 
FKVDI LNFA I PDTFLSHGSKEALTKS IGLDESSMTNR I LTHFI^FRSKKOTVGDVRV 

CPn_1061 1217932 1217666 

CT330 hypothetical protein 

fgslmveihhkdpslkklfalqqsletlnslsdivatyea/ifsliyeglnkalrkdolcy 
llti vnskgellk3 psgdp i vqt fp i hphh 

CPn_1062 1219835 1213159

xseA-Exodeoxyribonuclease VII
KGFPVM:;3PPQAVA3LTER [KTLLESNFCQI rVKGE^JNVfJLQP^GHLYFGIKDSQAFLN 
f:AFFMFK:;KYYDRKPKDf]DAVIIHGKLAVYAPRCQYy6tVAi(ALVYACEGDLL0KFEETKR 
KI.TAEGYFATEKKKPLPFAPQC IGVIT3PTGAV [Q& ILRVLCRRARNYKI LV/PVTVQCN 
: ;AAI IE I r;KA I EVMNAENLADVL 1 1 ARGCG J I EDLWAFNEE I LVKA [ HAi?T I P IVSAVGHE 
TDYTLfM^FAl-IDVPJVP'rPiIAAAEIVCKSSEEOVOwEGYLRHLLJIIiJROLLTSKKQOLLPW 
HHKLDRAEFYTTAQOQLD;; lEIAIOKGVCX^K I H^SKfjKYDNI.SRWLOGDLVSPMTCRLGS 
I.KKMLJOALr.lfKAL^LOVRCIIQLKKrjLTYPRg/LXJArJOKLl^'WRQULDTLICRRLHYOKE 
KYintKIITHI.KHAIirn/f.EWLR^IIVOKt.F.LLCiR^.L^^GCELNLONQK [AYANVKETLATIL 
KI<RYEN:;VA]r^r;ALKEOUI:;LNPKIWLKRtr^/MLFDFNEN:;AMi:JVDSLODJARVRI'jLQ 

ix ;ea i ltvtn r e t f ?kl ekc; 



rnij -Triosephosphdt^^^—p 
FCRE^MRIKFRENKERKKrP^Bfc^MHKTt0EAKE\VCTI^JLLOGEPL~CTICIA 

3PFT3LRAIHEVI NTTGAF LWLGAfflNVH P ELSGAFTG E IJ L PM LK EVGV EFVL VGH S ER R 
HrFGESDAFIASKVKSVAOACLVSvLCVGESLEVREEGKAHQVIKKQLLLCLECMDNGSE 
FL I AYEPVWAIGTGKVAEA3DVQD r HMFCREWAER FSEATAEE 1 3 1 LYGC5VKVDNAQR 
FGQCSDVDGLLVCGASLECQSFFEVAKNFNV 



CPn_1064

FY--"/Mp- r '* 



122/7 1220895 

;m-:: v -- ivnr.rivurr-r; 



CPn_1065 1221140 1220928

No robust homolog present in Genebank/EMBL as of 11/7/98
RHRLGRKRRTSDPCFLF^Fo I PEESLPPDSCRLNQMPKH EHLPS I LLKKP 1 1 DYLK ITS I 
YEKAIFNTGLP 

CPn_1066 1221132 1221488

No robust homolog present in Genebank/EMBL as of 11/7/98
SMSLNKE IGMTVLBYAFLF I FLFLCVI LCGL I LVQES KSMGLGS SFCVDSGDSVFGVST P 
D I LKRVTSWCAVAFCIGCLLLS FSTNLLGKKLDAKEFLL PAAEESDTQAS S ESVEADES 




CPn_1068 1223267 1222365

rnhB-Ribonuclease HII

MSCMPPPFVVTLTTSA0N^LPXCI^EKNFIFS0FONriWQA^SNrr\TCTLYPS 
KGSEEFIEFFLEPEILHTFTHARVEQDLRPRLGVDESGKGDFFGPLCIAAVYASNAEILK 
KLYEwKVODSKNLKDTKIASLARIIRSLCVCDVI ILYPEKYNELYGKFQNLNTLLAWAHA 
TVI/5£APKPAGDVFAISDQFAASErrLIJCAU3KKETDITLIQKPR^ 

QSIOKLEEQYQVQLPKGAGFNVKAAGREIAKORGKELLAKISKTHFKTFDEICSG 



CPn_1069 1223507 1223941

yfgA-HTH Transcriptional Regulator

/VIMQEHIHKEU,HLGEIFRSSRESQSLSLKDVEAATSIRYSCLEAIEC^LGKLISPVYA 
QGFIKKYATYJjGL^DSII^QEHPYVMKIFKEFSDHNMEMLIJIJLESMGGRIJSPERAIHSWS 
NLWWAGLI I IGG IMVWWLGSLFS I F 

CPn_1070 1225523 1224144 

No robust homolog present in Genebank/EMBL as of 11/7/98 

RRSLMTFPCGNCNCT/RETPPPNPGGEDIPLQEGGQSGSC^RVITG^PGTGGREMGISL 

GSDNVLGMVEQAGSLU^LDSARJ^RLGHYCYKTC^ 

ETVDDPDNPSAQFTOLIQQYGPICVGMSFWLPHCTCKIECCEPLGDGDKQET^GCKL 

HRELLKAAQ PRCMGES LVKLLQNNGLG EDMQQT PPWS L I LQAVSEGALS FVT S SDN P PTC 

WILQPEQQPCPPPPTDEEQt^AVGGAPAPC^KKHPAQECRVTCKLNFRTIXQKI^RLEV 

LSLESGYKGPLGQAAKQIVDLIKKSLKRLVASDIATFLGPGIGI^LESQWEVLVLUILL 

SKGYLPLDPI^PEQTVLDPRVC^PV^RILRK\/LVTTT 

WDDDEI ERDG IVTGGGFG I PCQCLRCWRKLPTEKRPNRWL 

CPn_1071 1227336 1225885 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KGTTMVC PNNSWFRMCGNFNCEWVEVTTTEETTROS ASD I SEEAGSSGGAAP ITTQPTKI 
TKVEKRVQFNTAQGDEST I HM I QEAGELVDS I LS HRRTQGCT EYC YDS Y ATGCGO RCGS F 
GRL ICGTYKACCLDREDNQVAG LVHEC EQTHG P I AVALAAKTMGLNLMELVEKNT ILSEE 
QKNEFRQHCSEAKTQLYGTMOSLSONFFLEGVNS I R ERG LDDS LVQ AVL S F I ATRSWEKT 
I ESEEASGTSS ASNSTR I PACY I LNTS PLTTS RLSCGSRDARRPSSVGAEPQ YVAKKYND 
NGMAKQLGK IQVTNLKTGDFSALGPFGLLIVKMLNSFLLS ASQSTSS I LKHTGGE ICYTC 
PNFRDIVVLLMLAIGYCPANTDETSVVDIHMIDDPIMTIFYRLQYSYRTGKTSASFLKKK 
PSLVRQESLDCPTPAESVPLMSSLEEEDENEDDDEDGNLAYQQRILECSGHLQTLFLGIK 
INKE 

CPn_1072 1227924 1228835 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KKDYI LHANWCCWKQMLK IOKKRMCVSWITVG AI VGFFNS ADAAPKKKK I P IQ I LYS FT 
KVSSYLKNEDASTIFCVDVDRGLLQHRYLGSPGWQETRRRQLFKSLENOSYGNERLGEET 
LA I DI FRNKECLES EI P EQMEA I LANSS ALVLG I S S FG I TG I PATLHS LLRQNLS FQKRS 
I ASESFLLK IDSAPSDASVFYKGVLFRGETAI VDALSQLFAQLDLS PKK 1 1 FLGEDPEW 
QAVGSAC IGWGMNFLGLVYYPAOESLFSYVHPYSTATELOEAQGLQVI SDEVAQLTLNAL 
PKMN 

CPn_1073 1229011 1229832 

Predicted OMP (CT371)

MRRYLFMVLALCLYRAAPLEAVVIKITDAOAVLKFAREKTLVCFNIEDTVVFPKQMVGQS 
AWLYNRELDLKTTLSEE0ARE0AFLEWMGISFLVDYELV3ANLRN*yLTGLSLKRSWVLGI 
SQR PVH L I KNTLR I LRS FN I DFTSC P A I C EDGWLSH PTKDTT FDOAMA I EKN I L FVGSLK 
NCQ PMDAALEVLLSG I SSPPSQ 1 1 YVDQDAERLRS IGAFCKKAN I YF IGMLYT PAKQRVE 
3YNPKLTA rQWSQ I RKNLSDEYYESLLSYVKSK 
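
Each entry in the listing above opens with a header line giving an ORF identifier and two genome coordinates. The following is a minimal Python sketch of how such header lines could be read. It assumes, as an interpretation not stated in the listing, that an entry whose first coordinate exceeds its second lies on the complementary strand. The names `OrfHeader` and `parse_orf_header` are hypothetical and used only for illustration. Headers damaged in extraction simply fail to parse rather than being guessed at.

```python
import re
from typing import NamedTuple, Optional


class OrfHeader(NamedTuple):
    """One ORF header line: identifier plus the two listed coordinates."""
    orf_id: str
    begin: int
    end: int

    @property
    def strand(self) -> str:
        # Assumption: begin > end is read as the complementary strand.
        return "+" if self.begin <= self.end else "-"

    @property
    def length_nt(self) -> int:
        # Inclusive span between the two coordinates, in nucleotides.
        return abs(self.end - self.begin) + 1


# Matches lines such as "CPn_1013 1162209 1163624".
HEADER_RE = re.compile(r"^(CPn_\d{4})\s+(\d+)\s+(\d+)$")


def parse_orf_header(line: str) -> Optional[OrfHeader]:
    """Return the parsed header, or None if the line is not a clean header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return OrfHeader(match.group(1), int(match.group(2)), int(match.group(3)))


if __name__ == "__main__":
    for text in ("CPn_1013 1162209 1163624", "CPn_1014 1165456 1163732"):
        header = parse_orf_header(text)
        print(header.orf_id, header.strand, header.length_nt)
```

Dividing `length_nt` by three gives a rough codon count that can be compared with the length of the listed amino acid sequence as a sanity check on a damaged entry.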
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CRNAs 



tRNA # Begin End Type Codon

1 89657 89728 Thr GGT
2 90993 91070 Trp CCA
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VM- 


' "AT 




5 296075 296147 Val TAC


6 296151 296224 Asp GTC
7 409848 409922 Pro TGG
8 462141 462214 Arg CCT
9 672236 672318 Leu CAA
10 677264 677337 Arg TCG
11 739403 739486 Leu CAG
12 781610 781680 Gly TCC
13 784822 784896 Glu TTC
14 784922 784994 Lys TTT
15 836119 836191 Ala GGC
16 843926 843999 Pro GGG
17 877400 877473 Arg ACG
18 1085605 1085676 Gln TTG
19 1142034 1142118 Ser TGA
20 1175863 1175944 Leu TAG
21 1230028 1229942 Ser CGA
22 1137462 1137389 Val GAC
23 1030603 1030533 Cys GCA
24 1000022 999949 His GTG
25 961607 961536 Gly GCC
26 807413 807341 Arg TCT
27 786780 786708 Thr CGT
28 715971 715889 Leu TAA
29 708441 708354 Ser GCT
30 680259 680178 Leu GAG
31 631445 631373 Phe GAA
32 626987 626901 Ser GGA
33 293477 293405 Thr TGT
34 293399 293317 Tyr GTA
35 269142 269070 Ala TGC
36 269065 268992 Ile GAT
37 164389 164318 Asn GTT
38 87522 87450 Met CAT
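
The tRNA entries can be read as five-field rows (number, begin, end, amino acid, triplet). The sketch below shows one way to parse such rows and derive strand and length. It rests on two assumptions that the listing itself does not state: that a begin coordinate larger than the end coordinate indicates the complementary strand, and that the triplets are anticodons written 5' to 3' (for example CAT for Met, whose reverse complement is the codon ATG). `TrnaRow`, `parse_trna_rows`, and `reverse_complement` are hypothetical names chosen for illustration.

```python
from typing import List, NamedTuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string made of A, C, G, and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


class TrnaRow(NamedTuple):
    number: int
    begin: int
    end: int
    amino_acid: str
    triplet: str

    @property
    def strand(self) -> str:
        # Assumption: begin > end is read as the complementary strand.
        return "+" if self.begin <= self.end else "-"

    @property
    def length(self) -> int:
        # Inclusive span between the two coordinates.
        return abs(self.end - self.begin) + 1


def parse_trna_rows(text: str) -> List[TrnaRow]:
    """Parse whitespace-separated rows; skip headers and damaged lines."""
    rows: List[TrnaRow] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            rows.append(TrnaRow(int(fields[0]), int(fields[1]),
                                int(fields[2]), fields[3], fields[4]))
        except ValueError:
            continue  # a non-numeric coordinate means the row is damaged
    return rows


SAMPLE = """\
1 89657 89728 Thr GGT
38 87522 87450 Met CAT
"""

if __name__ == "__main__":
    for row in parse_trna_rows(SAMPLE):
        print(row.number, row.amino_acid, row.strand, row.length,
              reverse_complement(row.triplet))
```

Under the anticodon reading, the reverse complement of each triplet gives the codon it would pair with, which offers a quick consistency check between the amino acid and triplet columns.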



IS 



